Abstract



Anaphora resolution is one of the major problems in natural language processing. 
It is also one of the important tasks in machine translation and man/machine di- 
alogue. We solve the problem by using surface expressions and examples. Surface 
expressions are the words in sentences which provide clues for anaphora resolu- 
tion. Examples are linguistic data which are actually used in conversations and 
texts. The method using surface expressions and examples is a practical method. 
This thesis handles almost all kinds of anaphora. 

1. The referential property and number of a noun phrase 

2. Noun phrase direct anaphora 

3. Noun phrase indirect anaphora 

4. Pronoun anaphora 

5. Verb phrase ellipsis 



Pronoun anaphora has been investigated by many researchers [Nagao et al 76]
[Kameyama 86] [Yamamura et al 92] [Takada & Doi 94] [Nakaiwa & Ikehara 95].
We used their results in addition to our new methods. In other areas of anaphora
resolution, there are scarcely any empirical works and thus this thesis breaks new
ground. In this thesis, the above five kinds of anaphora resolution by computer are
described in Chapter 2 through Chapter 6.

Chapter 2 shows that the referential property and number of noun phrases can
be estimated fairly reliably by the words in Japanese sentences (surface expressions).
The referential property and number of a noun phrase are basic factors in
anaphora resolution. The system can grasp the outline of the referent of the noun
phrase by using the referential property and number of a noun phrase. Many rules
for the estimation of the referential property and number are written in forms similar
to rewriting rules in expert systems with scores. We tested and verified the
effectiveness of this method.

Chapter 3 describes a method for estimating the referent of a noun phrase in
Japanese sentences using referential properties, modifiers, and possessors of noun 
phrases. In this analysis, referential properties are very important. For example, if 
the referential property of a noun phrase is definite, the noun phrase can refer to a 
previous noun phrase, and if the referential property of a noun phrase is indefinite, 
the noun phrase cannot refer to a previous noun phrase. Furthermore, we more 
precisely estimated referents of noun phrases using modifiers and possessors of 
noun phrases. We verified in our experiment the effectiveness of using referential 
properties, modifiers, and possessors of noun phrases. 

Chapter 4 describes how to resolve indirect anaphora. A noun
phrase can indirectly refer to an entity that has already been mentioned before. 
For example, "There is a house. The roof is white." indicates that "the roof" is 
associated with "a house", which was previously mentioned. When we analyze 
indirect anaphora, we need a case frame dictionary for nouns containing informa- 
tion about relationships between two nouns. But no noun case frame dictionary 
exists at present. Therefore, we used examples of "X of Y" and a verb case frame 
dictionary. We tested and verified that the information of "X of Y" is useful when 
we cannot make use of a noun case frame dictionary. We also proposed how to 
construct a noun case frame dictionary from examples of "X of Y" . 

Chapter 5 describes how to estimate the referent of a pronoun in Japanese
sentences. In conventional work, semantic markers have been used for semantic 
constraints. We used examples for semantic constraints and showed in our ex- 
periments that examples are as useful as semantic markers. We also proposed 
many new methods for estimating referents of pronouns. We experimented with 
pronoun resolutions on some texts and verified the effectiveness of our methods. 

Chapter 6 describes the method of resolving verb phrase ellipsis using surface
expressions and examples. When the referent of a verb phrase ellipsis appears
in the sentences, the structure of the elliptical sentence is commonly in a typical
form and the resolution is done by using surface expressions. When the referent
does not exist in the sentences, the system resolves the elliptical sentence using
examples. In the experiment, we obtained a high accuracy rate.
Chapter 1



Introduction 



1.1 Anaphora Resolution 

Natural language understanding is one of many researchers' dreams and has been
investigated in many areas such as machine translation and man-machine dialogue
[Winograd 72] [Nagao 84] [Hirst 86] [Hobbs et al 8^]. Let us consider what natural
language understanding is. Although machines will eventually understand
natural language and be able to talk with humans, they cannot do so at present.
The first step toward natural language understanding is for the machine to understand
the structure of a sentence. This has been investigated in several areas (morphological
analysis, syntactic analysis, and case analysis), and good results have been
reported [Matsumoto et al 92] [Kurohashi & Nagao 94] [Brill 95].



The next step is that the machine understands the object which a word refers 
to, which is called anaphora resolution. Although this has been investigated by 
many researchers, good results have still not been obtained. Therefore we devised 
a practical method to clarify how a word refers to an object. 

What kind of tasks are involved in the resolution of the object which a word 
refers to? At first, the system must recognize what a noun phrase refers to. It 
must also understand whether a noun phrase refers to a specified object or to 
a generic object. When a noun phrase partly relates to a noun phrase which 
has already been mentioned, the system must detect the relation. It must also 
understand what a pronoun or an ellipsis refers to. 
The above analyses are very important in machine translation and man-machine
dialogue. If an ellipsis is not resolved, machine translation and dialogue processing
cannot be performed. If the reference of a word is resolved, the precision
of generating articles ("the/a/an") and pronouns ("I/you/he") in machine translation
will increase. In a dialogue system, the number of counter questions to users
decreases and the processing becomes smoother.

The following is handled in this thesis: 

1. The referential property and the number of a noun phrase 

The system judges whether a noun phrase refers to a specific object or a 
generic object and estimates the number of the object. 



HON-TOIUNOWA NINGEN-NO SEICHOU-NI KAKASEMASEN.
(book) (human being) (growth) (be necessary)
(Books are necessary for the growth of the human being.)                (1.1)

(Desired solution: "HON" refers to books in general.)

2. Noun phrase direct anaphora

The system estimates what a noun phrase represents.

OJIISAN-WA JIMEN-NI KOSHI-WO-OROSHIMASHITA.
(old man) (ground) (sit down)
(The old man sat down on the ground.)

YAGATE OJIISAN-WA NEMUTTE-SHIMAIMASHITA.
(soon) (old man) (fall asleep)
(The old man soon fell asleep.)                                         (1.2)

(Desired solution: "OJIISAN" in the second sentence refers to
"OJIISAN" in the first sentence.)
3. Noun phrase indirect anaphora

The system estimates the object which a noun phrase indirectly refers to. 
In other words, the system detects the object which a noun phrase relates 
to in context. 



KINOU ARU HURUI IE-NI ITTA.

(yesterday) (a certain) (old) (house) (go) 

(I went to an old house yesterday.) 

YANE-WA HIDOI AMAMORIDE ... 

(roof) (badly) (be leaking) 

( The roof was leaking badly and ... ) 

(Desired solution: The underlined word "YANE (roof)" is the roof 
of "IE (house)" in the first sentence.) 



(1.3) 



4. Pronoun anaphora

The system estimates what a pronoun represents.

KINOU MIKAN-WO KATTA.
(yesterday) (oranges) (buy)
(I bought some oranges yesterday.)

TAROU-NO IE-NI ITTE KORE-WO TABETA.
(Taroo's) (house) (go) (this) (eat)
(I went to Taroo's house and ate them.)                                 (1.4)

(Desired solution: "KORE" refers to "MIKAN".)

5. Verb phrase ellipsis

The system recovers an omitted verb phrase.

SOU UMAKU IKUTOWA [OMOWANAI].
(so) (succeed so well) (I don't think)
([I don't think] it will succeed so well.)                              (1.5)

(Desired solution: "OMOWANAI (I don't think)" is recovered.)

The area of "4. Pronoun anaphora" has been investigated by many researchers
[Nagao et al 76] [Kameyama 86] [Yamamura et al 92] [Takada & Doi 94]
[Nakaiwa & Ikehara 95]. We used their results in addition to our new methods.
In the other areas of anaphora resolution, there are scarcely any empirical works,
so this thesis breaks new ground in this regard.

1.2 The Method Using Surface Expressions and Examples

In this thesis, we use a wide range of the information available for anaphora
resolution, with emphasis on surface expressions and examples.

Examples are linguistic data which are actually used in conversations and 
texts. By using examples we can resolve many linguistic problems. For example, 
suppose that we want to clarify the thing which "KORE (this)" represents in the 
following sentences. 

KINOU MIKAN-WO KATTA.
(yesterday) (oranges) (buy)
(I bought some oranges yesterday.)

TAROU-NO IE-NI ITTE KORE-WO TABETA.
(Taroo's) (house) (go) (this) (eat)
(I went to Taroo's house and ate them.)                                 (1.6)

In this case, we gather examples such as "RINGO-WO TABERU (I eat apples)"
and "KEIKI-WO TABERU (I eat cakes)", and extract "RINGO (apple)" and
"KEIKI (cake)" as the things which can correspond to "KORE (this)". Since "MIKAN
(orange)" is semantically similar to "RINGO (apple)" and "KEIKI (cake)" in
that all of them are food, we find that it is the antecedent of "KORE (this)". The
method using examples has wide application. If we find examples which are
analogous to the form of a problem, we can immediately use them to solve the
problem¹.
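The selection of "MIKAN" as the antecedent of "KORE" can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the example list, the category table, and the all-or-nothing similarity function are hypothetical toy data, not the resources or the similarity measure used in this thesis, and verbs are assumed to be already reduced to their base form.

```python
from typing import Dict, List, Tuple

# Hypothetical toy example base: (object noun, verb) pairs such as "RINGO-WO TABERU".
EXAMPLES: List[Tuple[str, str]] = [
    ("RINGO", "TABERU"),
    ("KEIKI", "TABERU"),
    ("HON", "YOMU"),
]

# Hypothetical semantic categories standing in for a real thesaurus.
CATEGORY: Dict[str, str] = {
    "RINGO": "food", "KEIKI": "food", "MIKAN": "food",
    "IE": "building", "TAROU": "person", "HON": "artifact",
}


def similarity(a: str, b: str) -> float:
    """Toy similarity: 1.0 if both nouns share a known category, else 0.0."""
    cat_a, cat_b = CATEGORY.get(a), CATEGORY.get(b)
    return 1.0 if cat_a is not None and cat_a == cat_b else 0.0


def resolve_pronoun(verb: str, candidates: List[str]) -> str:
    """Pick the candidate most similar to nouns seen as objects of `verb`."""
    fillers = [noun for noun, v in EXAMPLES if v == verb]

    def score(candidate: str) -> float:
        return max((similarity(candidate, f) for f in fillers), default=0.0)

    # Ties are broken by candidate order, which is an arbitrary choice here.
    return max(candidates, key=score)


# "KORE-WO TABETA": candidate antecedents are the nouns mentioned so far.
antecedent = resolve_pronoun("TABERU", ["MIKAN", "TAROU", "IE"])
print(antecedent)
```

A graded similarity over a thesaurus hierarchy would be the natural replacement for the binary `similarity` used in this sketch.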

Surface expressions are the clue words in sentences which are used in anaphora 
resolution. For example, suppose that we want to clarify the thing which "HON 
(book)" refers to in the following sentences. 

HON-TOIUNOWA NINGEN-NO SEICHOU-NI KAKASEMASEN.
(book) (human being) (growth) (be necessary)
(Books are necessary for the growth of human beings.)                   (1.7)

Since there is a surface expression such as "TOIUNOWA" in this sentence, we 
find that "HON (book)" does not refer to a specific book but refers to books in 
general. Using surface expressions also has a wide application. 

The surface expressions and examples used in this work are as follows. 

• Surface Expression 

— words 

— part-of-speech 

— syntax structure 

• Example 

— the case frame of a verb phrase 

— the semantic relation between two nouns. 

— example sentences 



¹ The method of using examples, which is called the example-based approach, was proposed
for machine translation [Nagao 84]. Although this method is used by many researchers
in machine translation, to our knowledge it has not been used in anaphora resolution.
1.3 The Overview of Later Chapters

This thesis describes how to resolve many problems in anaphora by using surface 
expressions and examples. 

Chapter 2 shows that the referential property and number of noun phrases
can be estimated fairly reliably by the words (surface expressions) in Japanese 
sentences. The referential property and number of a noun phrase are basic factors 
in anaphora resolution. The system can grasp the outline of the referent of the 
noun phrase by using the referential property and number of a noun phrase. Many 
rules for the estimation of the referential property and number are written in forms 
similar to rewriting rules in expert systems with scores. We tested and verified 
the effectiveness of this method. 

Chapter 3 describes a method for estimating the referent of a noun phrase in
Japanese sentences using referential properties, modifiers, and possessors of noun
phrases. In this analysis, referential properties are very important. For example,
if the referential property of a noun phrase is definite, the noun phrase can refer
to a previous noun phrase, and if the referential property of a noun phrase is
indefinite, the noun phrase cannot refer to a previous noun phrase. Furthermore,
we estimated referents of noun phrases more precisely using modifiers and
possessors of noun phrases. We carried out an experiment and verified that it is
effective to use referential properties, modifiers, and possessors of noun phrases
for estimating the referent of a noun phrase.

Chapter 4 describes how to resolve indirect anaphora. A noun
phrase can indirectly refer to an entity that has already been mentioned.
For example, "There is a house. The roof is white." indicates that "the roof" is
associated with "a house", which was mentioned in the previous sentence. When
we analyze indirect anaphora, we need a case frame dictionary for nouns containing
information about relations between two nouns. But no noun case frame
dictionary exists at present. Therefore, we used examples of "X of Y" and a verb
case frame dictionary instead. We carried out some experiments and verified that the
information of "X of Y" is useful when we cannot make use of a noun case frame
dictionary. We also proposed how to construct a noun case frame dictionary from
examples of "X of Y".

Chapter 5 describes how to estimate the referent of a pronoun in Japanese
sentences. In conventional work, semantic markers have been used for semantic
constraints. We used examples for semantic constraints and showed in our
experiments that examples are as useful as semantic markers. We also proposed
many new methods for estimating referents of pronouns. We experimented with
pronoun resolution on some texts and verified the effectiveness of our methods.

Chapter 6 describes the method of resolving verb phrase ellipsis using surface
expressions and examples. When the referent of a verb phrase ellipsis appears 
in the sentences, the structure of the elliptical sentence is commonly in a typical 
form and the resolution is done by using surface expressions. When the referent 
does not exist in the sentences, the system resolved the elliptical sentence using 
examples. As the result of the experiment, we obtained a high accuracy rate. 

Chapter 7 presents concluding remarks.



Chapter 2 

An Estimate of the Referential
Property and the Number of a
Noun Phrase



2.1 Introduction 

This chapter describes a method for the estimation of the referential property and 
number of a noun phrase by using surface expressions. The referential property of 
a noun phrase represents how the noun phrase denotes the referent. The referential 
property is classified into three types: generic, definite and indefinite. A definite 
noun phrase refers to a given object. An indefinite noun phrase refers to a new 
object. They correspond to a noun phrase with a definite article and a noun 
phrase with an indefinite article in English, respectively. A generic noun phrase 
refers to all objects which the noun phrase denotes. The number of a noun phrase 
is the number of the referent denoted by the noun phrase. The number is classified 
into three types: singular, plural, and uncountable. The referential property and 
number of a noun phrase are basic factors in anaphora resolution. The system 
can grasp the outline of the referent of the noun phrase by using the referential 
property and number of a noun phrase. The referential property and number are 
also useful when the system generates the article in translating Japanese nouns 
into English.

This chapter shows that the referential property and number of noun phrases
can be estimated fairly reliably by words (surface expressions) in the sentence.
Many rules for the estimation were written in forms similar to rewriting rules in
expert systems with scores. Since this method uses scores, it is well suited to
vague problems such as referential properties and numbers. We carried out an
experiment estimating the referential property and number of noun phrases and
verified that our method is effective.

2.2 Categories of Referential Property and Number 

2.2.1 Categories of Referential Property 

The referential property of a noun phrase here means how the noun phrase denotes
its referent. We classified noun phrases into the following three types according
to their referential property.

noun phrase
    generic noun phrase
    non-generic noun phrase
        definite noun phrase
        indefinite noun phrase



Generic Noun Phrase A noun phrase is classified as generic when it denotes
all members of the class of the noun phrase or the class itself of the noun phrase.
For example, "dogs" in the following sentence is a generic noun phrase.

Dogs are useful. (2.1)

Definite Noun Phrase A noun phrase is classified as definite when it denotes a 
contextually non-ambiguous member of the class of the noun phrase. For example, 
"the dog" in the following sentence is a definite noun phrase. 

The dog went away. (2.2)




Indefinite Noun Phrase An indefinite noun phrase denotes an arbitrary mem- 
ber of the class of the noun phrase. For example, the following "dogs" is an 
indefinite noun phrase. 

There are three dogs. (2.3)



2.2.2 Categories of Number 

The number of a noun phrase is the number of the referent denoted by the noun
phrase. Categories of number are as follows.

noun phrase
    countable noun phrase
        singular noun phrase
        plural noun phrase
    uncountable noun phrase



Singular Noun Phrase A noun phrase is classified as singular when it denotes 
a singular member of the class of the noun phrase. For example, "a book" in the 
following sentence is singular. 

She brought a book. (2.4)

Plural Noun Phrase A noun phrase is classified as plural when it denotes 
plural members of the class of the noun phrase. For example, "some books" in 
the following sentence is plural. 

She brought some books. (2.5)

Uncountable Noun Phrase A noun phrase is classified as uncountable when 
it denotes part of the class of the noun phrase which cannot be divided into 
individuals. For example, "copper" in the following sentence is used as material 
and uncountable. 

Copper conducts heat well. (2.6)
"KARE(he)-WA SONO(the)-BENGOSHI(lawyer)-NO(of)
MUSUKO(son)-NO(of) HITORI(one person)-DESU(is)."
(He is one of the sons of the lawyer.)

(a): Japanese sentence

(modifier --> head)
KARE(he)-WA              --> HITORI(one person)-DESU(is)
SONO(the)                --> BENGOSHI(lawyer)-NO(of)
BENGOSHI(lawyer)-NO(of)  --> MUSUKO(son)-NO(of)
MUSUKO(son)-NO(of)       --> HITORI(one person)-DESU(is)

(b): Dependency structure of sentence (a)

( <[noun common-noun _ _ 'HITORI' 'HITORI']
   [copula _ copula DESU-line-basic-form 'DA' 'DESU']
   [punctuation-mark period _ _ '.' '.']>
  ( <[noun common-noun _ _ 'MUSUKO' 'MUSUKO']
     [postpositional-particle noun-connection-postpositional-particle
      _ _ 'NO' 'NO']>
    ( <[noun common-noun _ _ 'BENGOSHI' 'BENGOSHI']
       [postpositional-particle
        noun-connection-postpositional-particle _ _ 'NO' 'NO']>
      ( <[demonstrative-adjective 'SONO' 'SONO']> )))
  ( <[noun common-noun _ _ 'KARE' 'KARE']
     [postpositional-particle topic-marking-postposition _ _ 'WA'
      'WA']
     [punctuation-mark komma _ _ ',' ',']> ))

(c): Dependency structure representation of sentence (a)

Figure 2.1: Example of dependency structure representation
( <[noun -] - >
  ( <[demonstrative-adjective 'SONO' 'SONO']> ) - )

Figure 2.2: An expression of the noun modified by "SONO (the)" 



2.3 How to Estimate Referential Property and Number

Heuristic rules for the referential property are given in the form:

(condition for rule application)
=> { indefinite(possibility, value) definite(possibility, value) generic(possibility, value) }

Heuristic rules for the number are given in the form:

(condition for rule application)
=> { singular(possibility, value) plural(possibility, value) uncountable(possibility, value) }

In the condition for rule application, a surface expression is written in the form
shown in Figure 2.2. Possibility has value 1 when the category (indefinite, definite,
generic, singular, plural, or uncountable) is possible in the context checked by
the condition. Otherwise the possibility value is 0. Value is a relative
plausibility, an integer between 1 and 10, given to the category on the
condition that its possibility is 1. A larger value means a higher
plausibility.

The rules are all heuristic, so the categories are not mutually exclusive. Under a
given condition both indefinite and generic may be possible, and singular and
plural may also co-exist. In these cases, however, their values may be different.

Several rules can be applicable to a specific noun in a sentence. In this case
the values are added up for each category, and the category with the maximum
total value is chosen as the final decision for the noun. An example is
given in Section 2.4.1.
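The addition of values over several rules can be sketched as follows. The rule outputs are the seven (possibility, value) triples listed for "KUDAMONO (fruit)" in Section 2.4.1; the function names, the treatment of possibility (a category stays possible only if no applied rule gives it 0), and the tie-breaking by category order are assumptions made for this illustration, not details stated in the text.

```python
from typing import Dict, List, Tuple

# One rule output maps each category to (possibility, value).
RuleOutput = Dict[str, Tuple[int, int]]

REF_CATEGORIES = ["indefinite", "definite", "generic"]


def rule(indefinite, definite, generic) -> RuleOutput:
    """Helper for writing one rule output in the order used in the text."""
    return {"indefinite": indefinite, "definite": definite, "generic": generic}


def combine(outputs: List[RuleOutput], categories: List[str]) -> RuleOutput:
    """Add up the values of all applied rules, category by category."""
    totals: RuleOutput = {}
    for cat in categories:
        # Assumption: a category remains possible only if no applied rule
        # gave it possibility 0.
        possibility = min((out[cat][0] for out in outputs), default=1)
        value = sum(out[cat][1] for out in outputs)
        totals[cat] = (possibility, value)
    return totals


def decide(totals: RuleOutput, default: str) -> str:
    """Choose the possible category with the maximum total value."""
    possible = [cat for cat, (p, _) in totals.items() if p == 1]
    if not possible:
        return default
    # Assumption: ties are broken by category order.
    return max(possible, key=lambda cat: totals[cat][1])


# Outputs of rules (a) to (g) for KUDAMONO in Section 2.4.1.
kudamono_rules = [
    rule((1, 0), (1, 2), (1, 3)),  # (a)
    rule((1, 0), (1, 1), (1, 0)),  # (b)
    rule((1, 0), (1, 1), (1, 0)),  # (c)
    rule((1, 0), (1, 1), (1, 0)),  # (d)
    rule((1, 0), (1, 1), (1, 0)),  # (e)
    rule((1, 0), (1, 3), (1, 4)),  # (f)
    rule((1, 1), (1, 0), (1, 0)),  # (g)
]

totals = combine(kudamono_rules, REF_CATEGORIES)
decision = decide(totals, default="indefinite")
print(totals, decision)
```

The same two functions apply unchanged to the number categories (singular, plural, uncountable) with "singular" as the default.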



When determining the referential property and number of nouns, the condition
part is matched not against a word sequence but against the dependency structure
of a sentence. The dependency structure of the sentence in Figure 2.1(a) is shown in
Figure 2.1(b); it is represented as in Figure 2.1(c)¹, and the condition is
checked against this representation. In heuristic rules, this expression can include
a wild card (represented by "-") which can match any partial dependency structure
representation. For example, a noun modified by "SONO (the)" is expressed as in
Figure 2.2. There are many other expressions such as regular expressions, AND-,
OR-, and NOT-operators, a MODee-operator for checking the modifier-modifyee
relation, and so on.
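Matching a condition against a dependency structure rather than a word sequence can be illustrated with a small sketch. The `Node` and `Pattern` classes and the functions below are hypothetical simplifications introduced here: they cover only the wild card and the "modified by" relation of Figure 2.2, not the regular expressions or the AND-, OR-, NOT-, and MODee-operators of the actual rule language.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One phrase in a dependency structure, with the phrases that modify it."""
    pos: str
    word: str
    modifiers: List["Node"] = field(default_factory=list)


@dataclass
class Pattern:
    """A condition; a field left as None acts like the wild card "-"."""
    pos: Optional[str] = None
    word: Optional[str] = None
    modifiers: List["Pattern"] = field(default_factory=list)


def matches(pattern: Pattern, node: Node) -> bool:
    """True if the node satisfies the pattern; extra modifiers are ignored."""
    if pattern.pos is not None and pattern.pos != node.pos:
        return False
    if pattern.word is not None and pattern.word != node.word:
        return False
    return all(
        any(matches(sub, mod) for mod in node.modifiers)
        for sub in pattern.modifiers
    )


def find_all(pattern: Pattern, root: Node) -> List[str]:
    """Collect the words of all nodes in the tree that match the pattern."""
    hits = [root.word] if matches(pattern, root) else []
    for mod in root.modifiers:
        hits.extend(find_all(pattern, mod))
    return hits


# The sentence of Figure 2.1, reduced to part of speech and word.
sono = Node("demonstrative-adjective", "SONO")
bengoshi = Node("noun", "BENGOSHI", [sono])
musuko = Node("noun", "MUSUKO", [bengoshi])
kare = Node("noun", "KARE")
hitori = Node("noun", "HITORI", [musuko, kare])

# The condition of Figure 2.2: a noun modified by SONO.
noun_with_sono = Pattern(pos="noun", modifiers=[Pattern(word="SONO")])
hits = find_all(noun_with_sono, hitori)
print(hits)
```

Only "BENGOSHI" is directly modified by "SONO", so a word-sequence match that also picked up "MUSUKO" or "HITORI" would be avoided by this kind of structural matching.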



Algorithm of the Determination of a Category 

The following steps are taken for the decision of a category for the referential 
property and the number. 

(1) Sentences are transformed into dependency structure representations. 

(2) A decision is made for each noun from left to right in the sentences
transformed into dependency structure representations. This order allows the
decision process to make use of the referential property and the number
already determined for earlier nouns (see Section 2.4.1 (c)(d) for an example).
For each noun, the referential property is determined first, and then the
number. This enables the referential property of a noun to be used when
analyzing the number of the noun (see Section 2.4.2 (3) for an example). In
these processes all the applicable rules are used, the possibility and value of
each category are computed, and the category with the maximum value is
chosen. An example of the result is shown in Figure 2.3. We can also utilize
the global information of the document to which a sentence belongs in the
decision process. The condition part, for example, can check whether there
are previous identical nouns. This information is useful for the determination
of the referential property.
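The order of the decisions in step (2) can be sketched as a simple loop. Everything below is a hypothetical simplification: nouns are reduced to a word, a particle, and a predicate; only values are added (possibility is ignored); and the three toy rules and their values stand in for the full rule set, although the third follows rule (3) of Section 2.4.2. The defaults "indefinite" and "singular" are the ones stated in Sections 2.4.1 and 2.4.2.

```python
from typing import Callable, Dict, List, Optional

Noun = Dict[str, str]
Scores = Dict[str, int]
# rule(noun, earlier_results, referential_property) -> scores, or None if not applicable
Rule = Callable[[Noun, List[Noun], Optional[str]], Optional[Scores]]

LIKE_PREDICATES = {"SUKI", "TANOSHIMU"}


def decide(rules: List[Rule], noun: Noun, earlier: List[Noun],
           ref: Optional[str], default: str) -> str:
    """Add the values of all applicable rules and return the best category."""
    totals: Scores = {}
    for rule in rules:
        scores = rule(noun, earlier, ref)
        if scores:
            for category, value in scores.items():
                totals[category] = totals.get(category, 0) + value
    return max(totals, key=totals.get) if totals else default


def analyze(nouns: List[Noun], ref_rules: List[Rule], num_rules: List[Rule]) -> List[Noun]:
    """Process nouns left to right; referential property first, then number."""
    results: List[Noun] = []
    for noun in nouns:
        ref = decide(ref_rules, noun, results, None, "indefinite")
        num = decide(num_rules, noun, results, ref, "singular")
        results.append({**noun, "ref": ref, "num": num})
    return results


def repeated_noun(noun, earlier, ref):
    # Uses document context: the same noun was presented previously.
    if any(e["word"] == noun["word"] for e in earlier):
        return {"definite": 2}  # hypothetical value
    return None


def liked_object_ref(noun, earlier, ref):
    if noun["predicate"] in LIKE_PREDICATES and noun["particle"] in ("GA", "WO"):
        return {"generic": 3}  # hypothetical value
    return None


def generic_object_num(noun, earlier, ref):
    # Uses the referential property decided just before (Section 2.4.2 (3)).
    if ref == "generic" and noun["predicate"] in LIKE_PREDICATES:
        return {"plural": 2}
    return None


REF_RULES = [repeated_noun, liked_object_ref]
NUM_RULES = [generic_object_num]

apples = analyze([{"word": "RINGO", "particle": "GA", "predicate": "SUKI"}],
                 REF_RULES, NUM_RULES)
cars = analyze(
    [{"word": "JOUYOUSHA", "particle": "TO", "predicate": "MOTSU"},
     {"word": "TORAKKU", "particle": "WO", "predicate": "MOTSU"},
     {"word": "JOUYOUSHA", "particle": "NIDAKE", "predicate": "KAKERU"}],
    REF_RULES, NUM_RULES)
print(apples, cars)
```

Because each result is appended before the next noun is processed, a rule such as `repeated_noun` can see every decision made to its left.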



¹ This is the result produced by the system of [Kurohashi & Nagao 94].
( <[noun common-noun _ _ 'HITORI' 'HITORI' indefinite singular]
   [be-verb _ be-verb DESU-line-basic-form 'DA' 'DESU']
   [punctuation-mark period _ _ '.' '.']>
  ( <[noun common-noun _ _ 'MUSUKO' 'MUSUKO' definite plural]
     [postpositional-particle noun-connection-postpositional-particle
      _ _ 'NO' 'NO']>
    ( <[noun common-noun _ _ 'BENGOSHI' 'BENGOSHI' definite singular]
       [postpositional-particle
        noun-connection-postpositional-particle _ _ 'NO' 'NO']>
      ( <[referential-pronominal 'SONO' 'SONO']> )))
  ( <[noun common-noun _ _ 'KARE' 'KARE' definite singular]
     [postpositional-particle sub-postpositional-particle _ _ 'WA'
      'WA']
     [punctuation-mark komma _ _ ',' ',']> ))

Figure 2.3: The result of analyzing the sentence in Figure 2.1



2.4 Heuristic Rules 

We have written 86 heuristic rules for the referential property and 48 heuristic
rules for the number. More than half of these rules are just implementations
of grammatical properties explained in standard grammar books of Japanese and
English [Kumayama 85] [Ikeuchi 85] [Koizumi 89], but there are many other
heuristic rules which we have created. All of the rules are described in the appendix.
Some of the rules are given below.

2.4.1 Heuristic Rules for Referential Property 

(1) When a noun is modified by a referential pronoun, KONO(this), SONO(its),
etc.,

then { indefinite (0, 0)² definite (1, 2) generic (0, 0) }
Example: KONO(this) HON-WA(book) OMOSHIROI(interesting)
This book is interesting.

(2) When a noun is accompanied by a particle (WA), and the predicate is in 
the past tense, 

then { indefinite (1, 0) definite (1, 3) generic (1, 1) } 



² (a, b) means the possibility (a) and the value (b).
Example: INU-WA(dog) MUKOUE(away there) IKIMASHITA(went)
The dog went away.

(3) When a noun is accompanied by a particle (WA), and the predicate is in 
the present tense, 

then { indefinite (1, 0) definite (1, 2) generic (1, 3) } 
Example: INU-WA YAKUNITATSU(useful) DOUBUTSU(animal) DESU(is)
Dogs³ are useful animals.

(4) When a noun is accompanied by a particle HE (to), MADE (up to) or KARA 
(from) , 

then { indefinite (1, 0) definite (1, 2) generic (1, 0) } 

Example: KARE-WO(he) KUUKOU-MADE(airport) MUKAE-NI(to meet)

YUKIMASHOO(let us go) 

Let us go to meet him at the airport. 

(5) When a noun phrase is accompanied by a particle NO (of), and it modifies
a noun phrase⁴,

then { indefinite (1, 0) definite (1, 2) generic (1, 3) }

Example: KARE-WA(he) KYOUIKU-NO(education) KACHI-WO(value)
NINSHIKI-SHITE-IMASEN(do not realize)
He doesn't realize the value of education.

There are many other expressions which give some clues for the referential
property of nouns, such as (i) the noun itself, "CHIKYUU (the earth)" [definite],
"UCHUU (the universe)" [definite], etc., (ii) nouns modified by a numeral
(Example: KORE-WA(this) ISSATSUNO(one) HON-DESU(book) [indefinite]. (This
is a book.)), (iii) the same noun presented previously (Example: KARE-WA(he)
JOUYOUSHA(car)-TO(and) TORAKKU-WO(truck) ICHIDAI-ZUTSU(by ones)
MOTTEIMASUGA(have), JOUYOUSHA-NIDAKE(car) [definite]
HOKEN-WO-KAKETEIMASU(be insured). (He has a car and a truck, but only the car is
insured.)), (iv) adverb phrases, "ITSUMO (always)", "NIHON-DEWA (in Japan)",
etc. (Example: NIHON-DEWA SHASHOU-WA(conductor) [generic] JOUKYAKU
(passenger)-NO(of) KIPPU-WO(ticket) SHIRABEMASU(check). (In Japan,
the conductor checks the tickets of the passengers.)), (v) verbs, "SUKI(like)",
"TANOSHIMU(enjoy)", etc. (Example: WATASHI-WA(I) RINGO-GA(apple)
[generic] SUKI-DESU(like). (I like apples.)).

³ Both "a dog" and "the dog" are possible because of the generic subject.

⁴ When a noun phrase is accompanied by a particle NO (of), it is not always a generic noun
phrase. But since "NO" is likely to accompany old information, a noun phrase with "NO" is
commonly a definite noun phrase or a generic noun phrase. Since we think that a definite noun
phrase can be estimated from other information, we give a generic noun phrase a higher value
in this rule.

In the case of no clues, "indefinite" is given to a noun as a default value. 

Since noun phrases which signify family relationships or body parts, such as
"MUSUKO (son)" and "ONAKA (stomach)", are almost always definite noun phrases,
it is better to use the rule that when a noun phrase denotes a family relationship
or a body part, it is judged to be a definite noun phrase. Since this rule was made
after the experiment on the test sentences in Section 2.5, we did not use it in
that experiment. To test the effectiveness of this rule, we repeated the experiment
with it. As a result, the accuracy decreased by 0.4% on the training sentences
and increased by 3% on the test sentences. The decrease occurs because in the
training sentences there are unexpectedly many cases in which a noun phrase which
indicates a relative or a body part is used as a non-definite noun phrase. For
ordinary sentences, we should use this rule. We used Bunrui Goi Hyou [NLRI 64]
to judge whether a noun phrase denotes a relative or a body part. A noun phrase
whose BGH code has the prefix "121" is regarded as a relative, and one whose
code has the prefix "157" is regarded as a body part.
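The prefix test on the Bunrui Goi Hyou code can be written as a one-line check. In this sketch only the two prefixes "121" and "157" come from the text; the function name is an assumption, and the codes in the toy table are invented placeholders chosen so that the prefixes match, not actual entries of the thesaurus.

```python
from typing import Dict, Optional

KIN_PREFIX = "121"        # prefix given in the text for relatives
BODY_PART_PREFIX = "157"  # prefix given in the text for body parts


def kin_or_body_part_rule(bgh_code: str) -> Optional[str]:
    """Return "definite" for a relative or body-part noun, else None (no judgement)."""
    if bgh_code.startswith((KIN_PREFIX, BODY_PART_PREFIX)):
        return "definite"
    return None


# Placeholder codes for illustration only; real codes must come from the thesaurus.
TOY_CODES: Dict[str, str] = {"MUSUKO": "12100", "ONAKA": "15700", "HON": "19999"}

judgements = {word: kin_or_body_part_rule(code) for word, code in TOY_CODES.items()}
print(judgements)
```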

Let us see an example which has several rule applications for the determination 
of the referential property of a noun. "KUDAMONO (fruit)" in the following 
sentence is an example. 



WAREWARE-GA KINOU TSUMITOTTA KUDAMONO-WA AJI-GA IIDESU
(we) (yesterday) (picked) (fruit) (taste) (be good)
(The fruit that we picked yesterday tastes delicious.)                  (2.7)
Seven rules are applied for the determination of the definiteness of this noun.
These are the following: 

(a) When a noun is accompanied by WA, and the corresponding predicate has 
no past tense 

(KUDAMONO-WA AJI-GA IIDESU),

then { indefinite (1, 0) definite (1, 2) generic (1, 3) } 

(b) When a noun is modified by an embedded sentence which is in the past 
tense (TSUMITOTTA), 

then { indefinite (1, 0) definite (1, 1) generic (1, 0) } 

(c) When a noun is modified by an embedded sentence which has a definite 
noun accompanied by WA or GA (WAREWARE-GA), 

then { indefinite (1, 0) definite (1, 1) generic (1, 0) } 

(d) When a noun is modified by an embedded sentence which has a definite 
noun accompanied by a particle (WAREWARE-GA), 

then { indefinite (1, 0) definite (1, 1) generic (1, 0) } 

(e) When a noun is modified by a phrase which has a pronoun (WAREWARE- 
GA), 

then { indefinite (1, 0) definite (1, 1) generic (1, 0) } 

(f) When a noun has an adjective as its predicate (KUDAMONO-WA AJI-GA
IIDESU),

then { indefinite (1, 0) definite (1, 3) generic (1, 4) } 

(g) When a noun is a common noun (KUDAMONO), 
then { indefinite (1, 1) definite (1, 0) generic (1, 0) } 

As the result of the application of all these rules, we obtained the final score
of { indefinite (1, 1) definite (1, 9) generic (1, 7) } for KUDAMONO
(indefinite: 0+0+0+0+0+0+1 = 1; definite: 2+1+1+1+1+3+0 = 9;
generic: 3+0+0+0+0+4+0 = 7), and "definite" is given as the decision.
2.4.2 Heuristic Rules for Number

(1) When a noun is modified by SONO(its), ANO(that), KONO(this), 
then { singular (1, 3) plural (1, 0) uncountable (1, 1) } 
Example: ANO(that) HON-WO (book) KUDASAI (give me) 

Give me that book . 

(2) When a noun is accompanied by a particle WA, GA, MO, WO, and there 
is a numeral x which modifies the predicate of a sentence, and 

if x = 1, then { singular (1, 2) plural (1, 0) uncountable (1, 0) }
if x ≥ 2, then { singular (1, 0) plural (1, 2) uncountable (1, 0) }
Example: RINGO- WO (apple) NIKO(two) TABERU(eat) 
I eat two apples. 

(3) When a predicate, SUKI(like), TANOSHIMU (enjoy), etc. has a generic 
noun as an object, and the noun is accompanied by GA(for SUKI), or 
WO (for TANOSHIMU), 

then { singular (1, 0) plural (1, 2) uncountable (1, 0) } 
Example: WATASHI-WA(I) RINGO-GA(apple) SUKI-DESU(like)
I like apples. 

There are many other expressions which determine the number of a noun,
such as (i) nouns modified by a numeral (Example: KORE-WA(this)
ISSATSUNO(one) HON-DESU(book) [singular]. (This is a book.)), (ii) verbs such as
ATSUMERU(collect), AFURERU(be full of), (Example: WATASHI-WA(I)
NEKO-NO(about cat) HON-WO(book) [plural] ATSUMETEIMASU(collect). (I
collect books on cats.)), (iii) adverbs such as NANDO-DEMO(as many times as
...), IKURA-DEMO(as much ...) (Example: RIYUU-WA(reason) [plural]
IKURA-DEMO(as much ...) SHIMESEMASU(give). (I can give you a number of
reasons.)).

In the case of no clues, "singular" is given as a default value. 
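
A minimal sketch of how rules (1) and (2) and the default could be applied is
given below. It is not the system's implementation: the dictionary keys
"modifier", "particle" and "predicate_numeral" are hypothetical, the values are
assumed to add up per category, and the second condition of rule (2) is read as
"two or more", in line with the NIKO (two) example.

```python
NUMBER_CATEGORIES = ("singular", "plural", "uncountable")


def rule_demonstrative(noun):
    """Rule (1): the noun is modified by SONO, ANO, or KONO."""
    if noun.get("modifier") in {"SONO", "ANO", "KONO"}:
        return {"singular": 3, "plural": 0, "uncountable": 1}
    return None


def rule_numeral(noun):
    """Rule (2): particle WA/GA/MO/WO and a numeral x modifying the predicate."""
    x = noun.get("predicate_numeral")
    if noun.get("particle") in {"WA", "GA", "MO", "WO"} and x is not None:
        if x == 1:
            return {"singular": 2, "plural": 0, "uncountable": 0}
        if x >= 2:  # assumption: "two or more"
            return {"singular": 0, "plural": 2, "uncountable": 0}
    return None


def estimate_number(noun, rules=(rule_demonstrative, rule_numeral)):
    """Sum the values of all rules that fire; default to "singular" if none does."""
    totals = dict.fromkeys(NUMBER_CATEGORIES, 0)
    fired = False
    for rule in rules:
        values = rule(noun)
        if values is not None:
            fired = True
            for category, value in values.items():
                totals[category] += value
    if not fired:
        return "singular"  # default value when there is no clue
    return max(NUMBER_CATEGORIES, key=totals.get)
```

For instance, a noun described as {"particle": "WO", "predicate_numeral": 2}
(RINGO-WO NIKO TABERU) is estimated to be plural, and a noun with no clue at all
falls back to singular.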
2.5 Experiments and Results

Experiments for the determination of the referential property and of the number
were done on the following three texts: typical example sentences in a grammar
book "Usage of the English Articles" [Kumayama 85], the complete text of
a Japanese popular folk tale "The Old Man with a Lump" [Nakao 85], and a small
fragment of an essay "TENSEI JINGO". The rules were written by referring to
these sentences, which have good English translations. These sentences can be
regarded as a training set. The results of the experiments are shown in Table 2.1.
Here "correct" means that the result was correct. "Reasonable" means that the
result is given, for example, as non-generic but the correct answer was definite,
etc. "Partially correct" means that the result was included in the correct answer.
"Undecidable" means that we could not judge which category is correct. We
obtained an 85.5% success rate for the determination of the referential properties
and an 89.0% success rate for the numbers over all these training sentences. The
scores in the table show that the heuristic rules are effective and applicable to
these sentences.
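
As a check on how the overall figures relate to Table 2.1, the sketch below pools
the "correct" counts over all nouns of the three training texts. The pooled
reading is an assumption (the text does not state how the average is formed),
but it reproduces the reported 85.5% and 89.0%; the names TRAINING and
success_rate are hypothetical.

```python
# "correct" counts and noun totals per training text, taken from Table 2.1.
TRAINING = {
    "Usage of the English Articles": {"nouns": 380, "ref_correct": 339, "num_correct": 349},
    "The Old Man with a Lump": {"nouns": 267, "ref_correct": 222, "num_correct": 234},
    "TENSEI JINGO": {"nouns": 98, "ref_correct": 76, "num_correct": 80},
}


def success_rate(texts, key):
    """Percentage of correct decisions, pooled over all nouns of all texts."""
    correct = sum(text[key] for text in texts.values())
    total = sum(text["nouns"] for text in texts.values())
    return round(100.0 * correct / total, 1)


referential_rate = success_rate(TRAINING, "ref_correct")  # 637 of 745 nouns
number_rate = success_rate(TRAINING, "num_correct")       # 663 of 745 nouns
```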

The modification and addition of rules in the experiment on training sentences
were performed as follows:

1. The modification and addition of rules were performed by examining errors.
In other words, we looked at the surface expressions near a noun phrase
which was incorrectly interpreted, and considered whether we could make a
new rule. We also checked whether we could correct this error by modifying
the condition and the points of the rule.

2. After some modifications and additions of rules were performed, we checked
whether the overall precision was higher or lower. When the overall precision
was higher, we formally adopted the modifications and additions which were
performed in step 1. When the overall precision was lower, we did not adopt
the modifications and additions, and repeated the examination in step 1.
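
The two steps amount to a greedy accept-or-reject loop, which can be sketched as
below. This is a schematic reading of the procedure, not the tooling actually
used: refine_rules, the candidate edits and the evaluation function are all
hypothetical, and a toy scoring function stands in for the overall precision on
the training sentences.

```python
def refine_rules(rules, candidate_edits, evaluate):
    """Keep a candidate edit only if the overall precision rises (step 2)."""
    best = evaluate(rules)
    for edit in candidate_edits:        # each edit comes from examining an error (step 1)
        trial = edit(rules)
        score = evaluate(trial)
        if score > best:                # higher precision: adopt the edit
            rules, best = trial, score
        # otherwise discard the edit and go on to the next error
    return rules, best


# Toy stand-in for "overall precision": useful rules help, others hurt.
USEFUL = {"r1", "r2"}


def toy_precision(rules):
    return len(rules & USEFUL) - len(rules - USEFUL)


edits = [
    lambda rules: rules | {"r1"},
    lambda rules: rules | {"bad"},
    lambda rules: rules | {"r2"},
]
final_rules, final_score = refine_rules(frozenset(), edits, toy_precision)
```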

In addition to this procedure, when we roughly examined some errors and
found a rule by which we could correct these errors, we added the rule to the
rule set. Moreover, when we were not certain whether we should add a certain
rule, we listed all the passages to which the rule applied and decided by looking
at them as a whole.

Table 2.1: Training sentences

                             Referential property                          Number
value                 indef     def   gener   other   total     singl  plural uncount   other   total

Usage of the English Articles (140 sentences, 380 nouns)
correct                  96     184      58       1     339       274      32      18      25     349
reasonable                -       3       1       -       4         1       1       1       -       3
partially correct         -       -       -       -       -         -       -       -      11      11
incorrect                 4      25       7       1      37         3      10       -       4      17
% of correct           96.0    86.8    87.9    50.0    89.2      98.6    74.4    94.7    62.5    91.8

The Old Man with a Lump (104 sentences, 267 nouns)
correct                  73     142       6       1     222       205      24       5       -     234
reasonable                3       4       -       -       7         2       -       -       -       2
partially correct         -       -       -       -       -         -       -       -       7       7
incorrect                11      23       4       -      38         1      22       1       -      24
% of correct           83.9    84.0    60.0   100.0    83.2      98.7    52.2    83.3     0.0    87.6

an essay "TENSEI JINGO" (23 sentences, 98 nouns)
correct                  25      35      16       -      76        64      13       -       3      80
reasonable                -       4       2       -       6         2       1       -       -       3
partially correct         -       -       -       -       -         -       -       -       6       6
incorrect                 5      10       1       -      16         1       6       1       1       9
% of correct           83.3    71.4    84.2       -    77.6      95.5    65.0     0.0    30.0    81.6

average
% of appearance        29.1    57.7    12.8     0.4   100.0      74.2    14.6     3.5     7.7   100.0
% of correct           89.4    84.0    84.2    66.7    85.5      98.2    63.3    88.5    49.1    89.0

To test the quality of these rules, we applied them to the following three texts:
a Japanese popular folk tale "TSURU NO ONGAESHI" [Nakao 85], three small
fragments of an essay "TENSEI JINGO", and "Pacific Asia in the Post-Cold-War
World" (A Quarterly Publication of The International House of Japan, Vol. 12,
No. 2, Spring 1992). These test sentences have good English translations. The
results are shown in Table 2.2. The success rates for the referential property
and the number decreased to 68.9% and 85.6% respectively on these test
sentences. These scores show, however, that the rules are still effective.

Table 2.2: Test sentences

                             Referential property                          Number
value                 indef     def   gener   other   total     singl  plural uncount   other   total

a folk tale "TSURU NO ONGAESHI" (263 sentences, 699 nouns)
correct                 109     363      13      10     495       610      13       1       1     625
reasonable                6      25       -       -      31        12       2       -       -      14
partially correct         -       -       -       -       -         -       -       -       1       1
incorrect                32     135       6       -     173         2      20      37       -      59
% of correct           74.2    69.4    68.4   100.0    70.8      97.8    37.1     2.6    50.0    89.4

an essay "TENSEI JINGO" (75 sentences, 283 nouns)
correct                  75      81      16       -     172       197      13       2       3     215
reasonable                8       9       1       -      18         3       1       -       -       4
partially correct         -       -       -       -       -         -       -       -       3       3
incorrect                33      51       9       -      93         3      55       3       -      61
% of correct           64.7    57.5    61.5       -    60.8      97.0    18.8    40.0    50.0    76.0

Pacific Asia in the Post-Cold-War World (22 sentences, 192 nouns)
correct                  21     108      11       2     142       157       6       1       1     165
reasonable                6       7       -       -      13         3       -       -       -       3
partially correct         -       -       -       -       -         -       -       -       -       -
incorrect                11      24       2       -      37         3      20       1       -      24
% of correct           55.3    77.7    84.6   100.0    74.0      96.3    23.1    50.0   100.0    85.9

average
% of appearance        25.6    68.4     4.9     1.0   100.0      84.3    11.1     3.8     0.8   100.0
% of correct           68.1    68.7    69.0   100.0    68.9      97.4    24.6     8.9    55.6    85.6
2.6 Discussion

Discussion on the Experiment of the Referential Property 

With respect to referential property, the success rate was 85.5% in the training 
sentences by which we elaborated our rule set. There was no category which 
was very bad. This indicates that our method of using surface expressions can 
estimate the referential properties of many noun phrases. 

The success rate was 68.9% in the test sentences, for which the rule set was held
fixed. The success rates of all the categories were uniformly good, at more than
60%. Definite noun phrases made up 74.8% of the noun phrases in the experiment
on "TSURU NO ONGAESHI". Therefore, if we made rules which handle every noun
phrase as a definite noun phrase, the success rate would become 74.8%, which is
higher than the success rate of 70.8% in the experiment. But this is not good,
because the success rates of indefinite noun phrases and generic noun phrases
would become 0%. We think that it is important that the success rates of all the
categories are uniformly good.
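
The comparison with the "everything is definite" baseline can be reproduced from
the "TSURU NO ONGAESHI" block of Table 2.2. The sketch below assumes that each
column of the table collects the noun phrases whose correct category is the one
named in the column header (this is what makes the definite column 523 of 699
nouns, i.e. 74.8%); the names COUNTS, success_rate and all_definite_baseline are
hypothetical.

```python
# Per-category counts for "TSURU NO ONGAESHI" (Table 2.2): correct decisions and
# column totals (correct + reasonable + incorrect).
COUNTS = {
    "indefinite": {"correct": 109, "total": 147},
    "definite": {"correct": 363, "total": 523},
    "generic": {"correct": 13, "total": 19},
    "other": {"correct": 10, "total": 10},
}


def success_rate(counts):
    """Overall percentage of correct decisions of the rule-based system."""
    correct = sum(c["correct"] for c in counts.values())
    total = sum(c["total"] for c in counts.values())
    return round(100.0 * correct / total, 1)


def all_definite_baseline(counts):
    """Label every noun phrase as definite: overall rate and per-category rates."""
    total = sum(c["total"] for c in counts.values())
    overall = round(100.0 * counts["definite"]["total"] / total, 1)
    per_category = {name: (100.0 if name == "definite" else 0.0) for name in counts}
    return overall, per_category


system_rate = success_rate(COUNTS)
baseline_rate, baseline_per_category = all_definite_baseline(COUNTS)
```

The baseline wins on the overall figure but scores 0% on indefinite and generic
noun phrases, which is the imbalance the text argues against.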

The success rate in training sentences is not good. If we modify the rule set, 
the success rate will easily rise. But when we try to increase the success rate 
in new sentences, it may be necessary to continue to make new rules for new 
sentences. 



Table 2.3 and Table 2.4 show examples which are analyzed incorrectly, even if we
modify the rule set. Table 2.3 is a set of examples which are analyzed incorrectly
because no key surface expression exists and the noun phrase is a definite noun
phrase. To solve this problem, we need information on contexts and situations.

Table 2.4 shows examples which are analyzed incorrectly when a noun phrase is
a generic noun phrase. We describe the reason for the error in each example.

There were some cases where it is difficult to analyze using only surface
expressions.

KORE-WA KARE-KARA KARITA JISHO DESU.

(this) (from him) (borrow) (dictionary) (be) (2.8)

(This is the dictionary that I borrowed from him.)

In this example, since "WATASHI-GA KARE-KARA KARITA JISHO (the
dictionary that I borrowed from him)" is modified by the embedded sentence,
it was judged to be a definite noun phrase. But when "WATASHI (I)" borrowed
several dictionaries from "KARE (him)" and "WATASHI-GA KARE-KARA
KARITA JISHO (the dictionary that I borrowed from him)" is one of them, it is
an indefinite noun phrase. Therefore it is difficult for the system to judge whether
a noun phrase is a definite noun phrase or an indefinite noun phrase unless the
system has information of this kind.

Table 2.3: Examples of definite noun phrases analyzed incorrectly (noun phrases
whose head words are underlined)

(1) KARE-WA SHACHOU-NO ONIISAN-DESU.
(he) (president) (brother)

(He is the brother of the president.)

(2) JHON-WA KURASU-NO NAKADE ICHIBAN SEGATAKATAI.
(John) (class) (in) (the most) (tall)

(John is the tallest in my class.)

(3) KANOJO-WA TEIBURU-WO HUKU-NONI HUKIN-WO TSUKATTA.
(she) (table) (to dust) (cloth) (use)

(She used a cloth to dust the table.)

(4) SHIGOTO-DE MUZUKASHII-TOKOROGA ATTA-GA KOKUHUKUSHITA.
(work) (difficulty) (exist) (overcome)

(I overcame a difficulty in my work.)

(5) WATASHI-WA SENSEI-TO ONAJI HON-WO MOTTE-IMASU.
(I) (teacher) (same) (book) (have)

(I have the same book as the teacher has.)

(6) KURUMA-WA MICHI-NO-WAKINI CHUUSHA-SHITE-ARIMASU.
(car) (along the street) (be parked)

(Cars are parked along the street.)

(7) JONSONKYOUJU-WA GAKKAI-DE RONBUN-WO YOMIMASHITA.
(Professor Johnson) (convention) (technical paper) (read)

(Professor Johnson read his paper at the convention.)
Table 2.4: Examples of generic noun phrases incorrectly analyzed (underlined
noun phrases)

(1) When the noun phrase is incorrectly judged as definite because it is modified
by an embedded sentence

SOREJITAI-WO MAMOROU-TO-SHINAI BUNKA-WA HOROBIMASU.

(itself) (do not defend) (culture) (die)

(A culture that does not defend itself will die.)

(2) When the noun phrase is incorrectly judged as definite because the predicate
is in the past tense

CHUUGOKUJIN-WA DOKUJI-NO MOJI-WO HATSUMEI-SHITA.

(Chinese) (own) (writing system) (invent)

(The Chinese invented their own writing system.)

(3) When the noun phrase is incorrectly judged as indefinite because it is followed
by a copula "DA"

NIHON-NO SHAKAI-DEWA CHICHIOYA-WA KACHOU-DESU.

(Japanese) (society) (father) (the head of the household)

(In Japanese society, the father is the head of the household.)

(4) When the noun phrase is incorrectly judged as indefinite because there is no clue

TABEMONO-GA OISHIKEREBA OISHIIHODO, TAKUSAN TABEMASU.
(food) (good) (the more) (much) (eat)

(The better the food is, the more I eat.)



Discussion on the Experiment of the Number 

The success rate was 89.0% in training sentences. But the success rate of "plural" 
was low. 

The success rate was 85.6% in test sentences. But the success rates of "plural" 
and "uncountable" were low. 

The following example is a plural noun phrase that was analyzed incorrectly.

CHUUMON-SHITA KENCHIKU-ZAIRYOU-GA KIMASHITA.

(order) (building material) (come) (2.9)

(The building materials you ordered have come in.)

The reason for the error is that there is no clue word. To judge this case to be
"plural", the system must judge it by the word "KENCHIKU-ZAIRYOU (building
material)" itself. But "KENCHIKU-ZAIRYOU (building material)" is not always
"plural".

The following example is a plural noun phrase analyzed properly without
quantifiers.

SONO JIKO-NO-ATO YAJIUMA-GA ATSUMATTE-KIMASHITA.

(after the accident) (people) (gather) (2.10)

(People gathered after the accident.)

"YAJIUMA" was judged to be "plural" using the verb "GA ATSUMARU (gather)".
If we make such a rule, we can occasionally analyze the number of a noun phrase
which is not modified by a quantifier.

After the experiment on the training sentences and test sentences, we examined
the rule using verbs such as "ATSUMARU (gather)", "NARABERU (put in
order)", and "ABIRU (pour water)". We gathered about 300 verbs from "Bunrui
Goi Hyou" [NLRI 64] which can be used in the estimation of the number. Examples
are shown in Table 2.5. We also checked how often noun phrases can be analyzed
properly by using these verbs. In the sentences of the essay "TENSEI JINGO"
which were analyzed properly by the syntactic parser (526 sentences, 2680 noun
phrases, two months of essays), there were 21 such noun phrases. This frequency
was low. But since the number of noun phrases which can be analyzed properly
still increases, we should use the rule using verbs as in Table 2.5 for the
estimation of the number.

Table 2.5: Examples of verbs which may be used in the estimation of the number
of the noun phrase

ABIRU (pour water), HUKIKAKERU (sprinkle), MABUSU (cover), WAKIDERU
(well up), SOROERU (put in order), UMORERU (be buried), MORERU (leak),
KOBORERU (drop, spill), MURAGARU (crowd), NOMU (drink)



2.7 Summary of this Chapter 

We obtained the correct recognition scores of 85.5% and 89.0% in the estimation 
of referential property and number respectively for the sentences which were used 
for the construction of our rules. We tested these rules for some other texts, and 
obtained the scores of 68.9% and 85.6% respectively. 

There are two problems in the estimation of the referential property. One 
is that although a human can easily recognize the referential property from the 
situation, the system cannot estimate the referential property. If we can make use 
of situational information, we can analyze the problem properly. 

Another problem is with respect to generic noun phrases. It is difficult to
define the generic noun phrase so that it is discriminated from the other
categories. The category may have to be reconstructed.

With respect to the number of a noun phrase, it is easily estimated, if it is 
modified by some surface expressions such as quantifiers. Since a noun phrase is 
not always modified by quantifiers, the estimation of the number is not so easy. 
There are some cases when the number is estimated properly by verbs such as 
"ATSUMERU (gather)" and adverbs such as "IKURADEMO (as much as one 
likes)". 



Chapter 3 

An Estimate of Referent of 
Noun Phrases 



3.1 Introduction 

This chapter describes how to estimate the referent of a noun phrase in Japanese 
sentences. It is important to clarify referents of noun phrases in machine transla- 
tion. For example, since the two "OJIISAN (old man)" in the following sentences 
have the same referent, the second "OJIISAN (old man)" should be pronominal- 
ized in translation into English. 

OJIISAN -WA JIMEN-NI KOSHI-WO-OROSHIMASHITA. 

(old man) (ground) (sit down) 

( The old man sat down on the ground.) 

(3.1) 
YAGATE OJIISAN -WA NEMUTTE-SHIMAIMASHITA. 

(soon) (old man) (fall asleep) 

(He (= the old man) soon fell asleep.) 

When dealing with a situation like this, it is necessary that a machine translation 
system should recognize that two "OJIISAN (old man)" have the same referents. 
In this chapter, we propose a method for determining referents of noun phrases
using (1) referential properties of noun phrases, (2) modifiers in noun phrases, and
(3) possessors of objects denoted by noun phrases.

For languages that have articles, like English, we can guess by using articles
whether two noun phrases refer to each other or not. In contrast, for languages
that have no articles, like Japanese, it is difficult to decide whether two noun
phrases refer to each other. We estimated referential properties of noun phrases,
which correspond to articles, as shown in Chapter 2. By using these referential
properties, our system determines referents of noun phrases in Japanese sentences.
Noun phrases are classified by referential property into generic noun phrases,
definite noun phrases, and indefinite noun phrases. When a noun phrase is a
definite noun phrase, it can refer to a noun phrase that has already appeared.
When a noun phrase is an indefinite noun phrase or a generic noun phrase, it
cannot refer to a noun phrase that has already appeared.

It is insufficient to determine referents of noun phrases only using referential 
property. This is because even if the referential property of a noun phrase is a 
definite noun phrase, the noun phrase does not refer to a noun phrase which has a 
different modifier or a possessor. Therefore, we also use modifiers and possessors 
of noun phrases in determining referents of noun phrases. 

3.2 Referential Property of Noun Phrase 

The following is an example of noun phrase anaphora. 

OJIISAN TO OBAASAN-GA SUNDEORIMASHITA. 

(an old man) (and) (an old woman) (lived) 

(There lived an old man and an old woman.) 

(3.2) 

OJIISAN -WA YAMA-HE SHIBAKARI-NI IKIMASHITA. 

(old man) (mountain) (to gather firewood) (go) 

( The old man went to the mountains to gather firewood.) 

"OJIISAN (old man)" in the first sentence and "OJIISAN (old man)" in the 
second sentence refer to the same old man, and they are in anaphoric relation. 

When the system analyzes the anaphoric relation of noun phrases like this,
the referential properties of noun phrases are important. The referential property
of a noun phrase here means how the noun phrase denotes the referent. Since the
second "OJIISAN (old man)" has the referential property of a definite noun phrase,
indicating that it refers to a contextually non-ambiguous object, the system can
recognize that it refers to the first "OJIISAN (old man)". The referential property
plays an important role in clarifying anaphoric relations.

We classified noun phrases by referential property into the following three
types, as shown in Chapter 2.

noun phrase
    generic noun phrase
    non-generic noun phrase
        definite noun phrase
        indefinite noun phrase



Generic noun phrase A noun phrase is classified as generic when it denotes 
all members of the class of the noun phrase or the class itself of the noun phrase. 
For example, "INU(dog)" in the following sentence is a generic noun phrase. 

INU-WA YAKUNI-TACHIMASU. 

(dog) (useful) (3.3) 

( Dogs are useful.) 

A generic noun phrase cannot refer to an indefinite/definite noun phrase. Two 
generic noun phrases can refer to each other. 

Definite noun phrase A noun phrase is classified as definite when it denotes a 
contextually non-ambiguous member of the class of the noun phrase. For example, 
"INU(dog)" in the following sentence is a definite noun phrase. 

INU-WA MUKOUHE IKIMASHITA. 

(dog) (away) (go) (3.4) 

(The dog went away.) 

A definite noun phrase can refer to a noun phrase that has already appeared. 

Indefinite noun phrase An indefinite noun phrase denotes an arbitrary member
of the class of the noun phrase. For example, the following "INU(dog)" is an
indefinite noun phrase.

INU-GA SANBIKI IMASU.

(dog) (three) (there is) (3.5)

(There are three dogs.)

An indefinite noun phrase cannot refer to a noun phrase that has already appeared.

3.3 How to Estimate Referent of Noun Phrase 

To determine referents of noun phrases, we made the following three constraints. 

1. Referential property constraint 

2. Modifier constraint 

3. Possessor constraint 

When two noun phrases which have the same head noun satisfy these three con- 
straints, the system judges that the two noun phrases refer to each other. These 
three constraints are as follows: 

3.3.1 Referential Property Constraint 

First, our system estimates the referential property of a noun phrase using the
method in Chapter 2. The method estimates a referential property using surface
expressions in the sentences. For example, since the second "OJIISAN (old man)" 
in the following sentences is accompanied by a particle "WA (topic)", and the 
predicate is in the past tense, it is estimated to be a definite noun phrase. 

OJIISAN -WA JIMEN-NI KOSHI-WO-OROSHIMASHITA. 

(old man) (ground) (sit down) 

( The old man sat down on the ground.) 

(3.6) 
YAGATE OJIISAN -WA NEMUTTE-SHIMAIMASHITA. 

(soon) (old man) (fall asleep) 

( He soon fell asleep.) 




Next, our system determines the referent of a noun phrase using its estimated 
referential property. When a noun phrase is estimated to be a definite noun 
phrase, our system judges that the noun phrase refers to a previous noun phrase 
which has the same head noun. For example, the second "OJIISAN" in the above 
sentences is estimated to be a definite noun phrase, and our system judges that 
it refers to the first "OJIISAN" . 

When a noun phrase is not estimated to be a definite noun phrase, the noun
phrase may still refer to a noun phrase that has already been mentioned, because
the estimation of the referential property may have failed. Therefore, when a noun
phrase is not estimated to be a definite noun phrase, our system gets possible
referents of the noun phrase from topics and foci, and determines the referent of
the noun phrase using the following three kinds of information.

• the plausibility that the estimated referential property is a definite noun
phrase

• the weight of a possible referent in the case of topic or focus 

• the distance between the estimated noun phrase and a possible referent 

3.3.2 Modifier Constraint 

It is insufficient to determine referents of noun phrases by only using referential
property. When two noun phrases have different modifiers, they commonly do
not refer to each other. For example, "HIDARI(left)-NO HOO(cheek)" in the
following sentences does not refer to "MIGI(right)-NO HOO(cheek)".

KONO OJIISAN-NO KOBU-WA MIGI-NO HOO-NI ARIMASHITA.

(this) (old man) (lump) (right) (cheek) (be on)

(This old man's lump was on his right cheek.)

(3.7)
TENGU-WA, KOBU-WO HIDARI-NO HOO-NI TSUKETE-SHIMAIMASHITA.

(tengu)¹ (lump) (left) (cheek) (put on)

(The "tengu" put a lump on his left cheek.)

Therefore, we made the following constraint: When a noun phrase has a modifier,
it cannot refer to a noun phrase that does not have the same modifier. When a
noun phrase does not have a modifier, it can refer to a noun phrase that has any
modifier.
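
The modifier constraint is an asymmetric test on the later noun phrase and its
candidate referent. A minimal sketch, assuming each noun phrase carries at most
one modifier represented as a string (None when absent); the function name is
hypothetical:

```python
from typing import Optional


def satisfies_modifier_constraint(anaphor_modifier: Optional[str],
                                  antecedent_modifier: Optional[str]) -> bool:
    """Return True if the later noun phrase may refer to the earlier one."""
    if anaphor_modifier is None:
        # No modifier: the noun phrase can refer to a noun phrase with any modifier.
        return True
    # With a modifier: the candidate must carry the same modifier.
    return anaphor_modifier == antecedent_modifier
```

In example (3.7), "HIDARI-NO HOO" is checked against "MIGI-NO HOO" and the
constraint fails, so the two cheeks are kept apart.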

3.3.3 Possessor Constraint 

When a noun phrase has a semantic marker PAR (a part of a body)², our system
tries to estimate the possessor of the object denoted by the noun phrase. We
suppose that the possessor of a noun phrase is the subject or the noun phrase's
nearest topic that has a semantic marker HUM (human) or a semantic marker ANI
(animal). For example, the possessor of the first "HOO (cheek)" in the following
sentences is estimated to be "OJIISAN (old man)" because "OJIISAN (old man)"
is followed by a particle "NIWA", is the topic in the sentence, and has a semantic
marker HUM (human).

OJIISAN-NIWA [OJIISAN-NO]³ HIDARI-NO HOO-NI KOBU-GA ARIMASHITA.

(old man) (old man's) (left) (cheek) (lump) (be on) 

(This old man had a lump on his left cheek .) 

SORE-WA HITO-NO KOBUSHI-HODOMO-ARU KOBU-DESHITA. 

(it) (person) (fist) (lump) 

(It is about the size of a person's fist.) 

[OJIISAN-NO] HOO-WO HUKURAMASETE- IRUYOUNI MIERUNODESHITA. 
(old man) (cheek) (puff) (look) 

(He looked as if he had puffed out his cheek .) 

The possessor of the second "HOO (cheek)" is also estimated to be "OJIISAN
(old man)" because "OJIISAN (old man)" is the subject in the sentence⁴.

¹ A tengu is a kind of monster.

² In this thesis, we use the Noun Semantic Marker Dictionary [Watanabe et al 92]
as a semantic marker dictionary.

³ The words in brackets [ ] are omitted in the sentences.

⁴ Omitted subjects are estimated by the method described in a later chapter.

We made the following constraint by using possessors. When the possessor of
a noun phrase is estimated, the noun phrase cannot refer to a noun phrase that
does not have the same possessor. When the possessor of a noun phrase is not
estimated, the noun phrase can refer to a noun phrase that has any possessor.
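
The possessor estimation and the possessor constraint can be sketched in the same
style. The data layout is hypothetical (a noun phrase is a dictionary with a
"head" and a set of semantic "markers"), and the order in which candidates are
tried, the subject first and then the topics from nearest to farthest, is an
assumption: the text only says "the subject or the nearest topic". The empty
marker set given to KOBU is likewise only for illustration.

```python
POSSESSOR_MARKERS = {"HUM", "ANI"}  # human or animal


def estimate_possessor(noun, subject, topics):
    """Estimate the possessor of a body-part noun; topics are ordered nearest first."""
    if "PAR" not in noun["markers"]:
        return None                      # only body parts get a possessor
    candidates = ([subject] if subject is not None else []) + list(topics)
    for candidate in candidates:
        if POSSESSOR_MARKERS & set(candidate["markers"]):
            return candidate["head"]
    return None


def satisfies_possessor_constraint(anaphor_possessor, antecedent_possessor):
    """An estimated possessor must match; an unestimated one matches anything."""
    if anaphor_possessor is None:
        return True
    return anaphor_possessor == antecedent_possessor


cheek = {"head": "HOO", "markers": {"PAR"}}
lump = {"head": "KOBU", "markers": set()}       # illustrative marker set
old_man = {"head": "OJIISAN", "markers": {"HUM"}}

# First sentence: the subject is KOBU, the topic is OJIISAN (marked by NIWA).
first_possessor = estimate_possessor(cheek, subject=lump, topics=[old_man])
# Third sentence: the omitted subject is OJIISAN.
second_possessor = estimate_possessor(cheek, subject=old_man, topics=[])
```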

For example, since the two "HOO (cheek)" in the above sentences have the
same possessor "OJIISAN (old man)", our system correctly judges that the two
"HOO (cheek)" have the same referent.

3.4 Anaphora Resolution System 

3.4.1 Procedure 

Before determining referents, sentences are transformed into a case structure by
the case structure analyzer [Kurohashi & Nagao 94].

Referents of noun phrases are determined by heuristic rules which are made
from such information as the three constraints mentioned in Section 3.3. Using
these rules, our system takes possible referents and gives them points. It judges
that the candidate having the maximum total score is the referent. This is because
several types of information are combined in anaphora resolution. We can
specify which rule takes priority by using points.

The heuristic rules are given in the following form.

Condition ⇒ { Proposal Proposal ... }
Proposal := ( Possible-Referent Point )

In Condition, surface expressions, semantic constraints, referential properties, etc.
are written as conditions. In Possible-Referent, a possible referent, "indefinite", or
other things are written. "Indefinite" means that the noun phrase is an indefinite
noun phrase, and it does not refer to a previous noun phrase. Point means the
plausibility value of the possible referent.
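
The rule form can be read as a function from a noun phrase to a list of
(Possible-Referent, Point) proposals, with the points of all rules added per
candidate. The sketch below encodes rules R2, R5 and R6 of Section 3.4.2 in this
way; the representation of a noun phrase as a dictionary with the keys "modifier"
and "estimated" and the name resolve are hypothetical.

```python
from collections import defaultdict


def rule_r2(noun_phrase):
    """R2: modified by SOREZORE-NO or ONOONO-NO (each)."""
    if noun_phrase.get("modifier") in {"SOREZORE-NO", "ONOONO-NO"}:
        return [("Indefinite", 25)]
    return []


def rule_r5(noun_phrase):
    """R5: estimated to be a generic noun phrase."""
    return [("Generic", 10)] if noun_phrase.get("estimated") == "generic" else []


def rule_r6(noun_phrase):
    """R6: estimated to be an indefinite noun phrase."""
    return [("Indefinite", 10)] if noun_phrase.get("estimated") == "indefinite" else []


def resolve(noun_phrase, rules):
    """Collect all proposals, add the points per candidate, take the maximum."""
    totals = defaultdict(int)
    for rule in rules:
        for candidate, point in rule(noun_phrase):
            totals[candidate] += point
    if not totals:
        return None, {}
    best = max(totals, key=totals.get)
    return best, dict(totals)


RULES = (rule_r2, rule_r5, rule_r6)
best, totals = resolve({"modifier": "SOREZORE-NO", "estimated": "indefinite"}, RULES)
```

Because the points are added, two weak rules that agree can outweigh a single
stronger rule, which is how priorities between rules are expressed.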

3.4.2 Heuristic Rule for Estimating Referents 

We made 8 heuristic rules for noun phrase anaphora resolution. All the rules are 
given below. 
R1 When a noun phrase is like "IKA (the following)",
{(Next sentences, 50)}⁵

R2 When a noun phrase is modified by the words "SOREZORE-NO (each)" 
and "ONOONO-NO (each)", 
{(Indefinite, 25)} 

R3 When a noun phrase is the word "JIBUN (oneself)", 
{(The subject in the sentence, 25)} 

R4 When a noun phrase is estimated to be a definite noun phrase, and satisfies 
modifier constraint and possessor constraint, and the same noun phrase X 
has already appeared, 
{(The noun phrase X, 30)} 

R5 When a noun phrase is estimated to be a generic noun phrase, 
{(Generic, 10)} 

R6 When a noun phrase is estimated to be an indefinite noun phrase, 
{(Indefinite, 10)} 

R7 When a noun phrase is like "ISSHO (together)" and "HONTOU (true)", 
which is used as an adverb or an adjective, 

{(No referent, 30)} 

(Ex.) TENGU-TACHI-WA ISSHO(together)-NI WARAI-DASHIMASHITA.

(tengu) (together) (laugh) (begin) 

(The tengu began laughing together. ) 

R8 When a noun phrase is not estimated to be a definite noun phrase,

{ (A noun phrase X which satisfies modifier constraint and possessor
constraint, W - D + P + 4) }

The values W, D, P are defined as follows: The definition and the weight
(W) of topic and focus are given in Table 3.1 and Table 3.2 respectively. (In
this work, a topic is defined as a theme which is described, and a focus is
defined as a word which is stressed by the speaker (or the writer). But we
cannot detect topics and foci correctly. Therefore we approximated them
by Table 3.1 and Table 3.2.) When a possible referent is a topic, the distance
(D) between the estimated noun phrase and the possible referent is
the number of topics between them. When a possible referent is a focus, the
distance (D) is the number of foci between them. The plausibility (P) that
the referential property is definite is given in Table 3.3. In the table,
"Difference score between definite and other referential property" is determined
as follows. When the method in Chapter 2 estimates a referential property,
it gives each category of referential property some points, and it outputs the
score of each category. From these scores our system calculates "Difference
score between definite and other referential property". These values were
determined by hand on the training sentences mentioned in Section 3.5.1.

⁵ (a, b) means candidate (a) and point (b).

Table 3.1: The weight in the case of topic

Surface Expression            Example                              Weight
Pronoun/Zero-Pronoun GA/WA    (John GA (subject)) SHITA (done).    21
Noun WA/NIWA                  John WA (subject) SHITA (do).        20

Table 3.2: The weight in the case of focus

Surface Expression                       Example                              Weight
Pronoun/Zero-Pronoun
  WO (object)/NI (to)/KARA (from)        (John NI (to)) SHITA (done).         16
Noun GA (subject)/MO/DA/NARA             John GA (subject) SHITA (do).        15
Noun WO (object)/NI/, /.                 John NI (object) SHITA (do).         14
Noun HE (to)/DE (in)/KARA (from)         GAKKOU (school) HE (to) IKU (go).    13

Table 3.3: The plausibility (P) that the referential property is definite

Difference score between definite and other referential property     1     2     3~
The plausibility P                                                   -3    -6    -∞



3.4.3 Example of Estimating the Referent of a Noun Phrase 



An example of determining the referent of a noun phrase is shown in Figure 3.1.
This figure shows that the underlined "HI (fire)" in the figure was interpreted
properly. The process is as follows:

At first, our system estimated the referential property of the underlined "HI
(fire)". The referential property was incorrectly estimated to be a generic noun
phrase, as shown in the table "Estimate of referential property" in the figure. Since
the estimated referential property was a generic noun phrase, the rule R5 proposed
a possible referent "Generic", and gave it 10 points. Also, the rule R8, which
applies when the estimated referential property may be incorrect, proposed a
possible referent, "HI (fire)" in the previous sentence. Since the underlined "HI
(fire)" has neither a modifier nor a possessor, it satisfied the modifier constraint
and the possessor constraint. The possible referent was given the value of the
evaluation function W - D + P + 4 of the referential property constraint.
The weight W was 15 by Table 3.2 because the possible referent is followed by the
particle "GA (subject)". The distance D was 4 because there are four foci, "OTOKO
(man)", "KAO (face)", "KI (notice)" and "HI (fire)" in the previous sentence,
between the underlined "HI (fire)" and "HI (fire)" in the previous sentence.
Since the difference score between definite and other referential property was 1
(= 3 (generic) - 2 (definite)), the plausibility (P) was -3 by Table 3.3.
Therefore, the value of the evaluation function W - D + P + 4 is 12
(= 15 - 4 - 3 + 4), and "HI (fire)" in the previous sentence was given 12 points.
Since the value 12 of "HI (fire)" was higher than the value 10 of "Generic", our
system correctly judged that the underlined "HI (fire)" refers to the "HI (fire)"
in the previous sentence. As a result, the referential property of the underlined
"HI (fire)" was correctly judged to be a definite noun phrase.
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OJIISAN-WA AKICHI-NI HI-GA MOETEIRU-NONI KIGA-TSUKIMASHITA. 

(old man) (open space) (fire) (burn) (notice) 

(The old man noticed that there was a big bright fire burning in an open space.) 

AKAI KAO-WO-SHITA OTOKO-TACHI-GA, HI-NO MAWARI-NI 

(red) (face) (man) (fire) (around) 

TATTEIRU-NOWO MIMASHITA. 

(stand) (see) 

(He saw some men with red faces standing around the fire .) 



Satisfied Rule 


Score 




Generic 


"HI (fire)" 

in the previous sentence 


Rule 5 
Rules 


10 


12 


Total Score 


10 


12 



Estimate of referential property 



Referential property 


Indefinite 


Definite 


Generic 


Point 


1 


2 


3 



''HI (fire)" in the previous sentence has the following score. 
VF-D + P^ 15-4-3 + 4= 12 

Figure 3.1: Example of estimating the referent of a noun phrase 
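The arithmetic of this example can be retraced with a few lines of Python. This is only a sketch of the worked example, not the system's implementation: the function name `referent_score` and its parameter names are hypothetical, and the constant 4 is taken directly from the expression W - D + P + 4 used in the example.

```python
def referent_score(weight, distance, plausibility, bonus=4):
    """Evaluate W - D + P + bonus for one candidate referent."""
    return weight - distance + plausibility + bonus


# Values reported in the text for "HI (fire)" in the previous sentence:
# W = 15 (Table 3.2), D = 4 (four foci), P = -3 (Table 3.3).
hi_score = referent_score(weight=15, distance=4, plausibility=-3)

# Rule R5 gave the candidate "Generic" 10 points.
scores = {"Generic": 10, "HI (fire) in the previous sentence": hi_score}

# The candidate with the highest total score is chosen as the referent.
chosen = max(scores, key=scores.get)
print(hi_score, chosen)
```

With these inputs the candidate from the previous sentence outscores "Generic", which matches the judgment described for Figure 3.1.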



3.5 Experiment and Discussion 



3.5.1 Experiment 

Before estimating the referents of noun phrases, sentences were first
transformed into a case structure by the case structure analyzer
[Kurohashi & Nagao 94]. The errors made by the case analyzer were corrected by
hand. The results of estimating the referents of noun phrases are shown in
Table 3.4.

Table 3.4: Result

                       Precision         Recall
Training sentences     82% (130/159)     85% (130/153)
Test sentences         79% (89/113)      77% (89/115)

Training sentences {example sentences (43 sentences), a folk tale "KOBUTORI
JIISAN" [Nakao 85] (93 sentences), an essay in "TENSEIJINGO" (26 sentences),
an editorial (26 sentences), an article in "Scientific American (in
Japanese)" (16 sentences)}

Test sentences {a folk tale "TSURU NO ONGAESHI" [Nakao 85] (91 sentences),
two essays in "TENSEIJINGO" (50 sentences), an editorial (30 sentences),
articles in "Scientific American (in Japanese)" (13 sentences)}



To verify that the three constraints (the referential property, modifier, and
possessor constraints) are effective, we repeated the experiment under
modified conditions and compared the results, which are shown in Table 3.5.
The upper row and the lower row of this table show precision and recall,
respectively. Precision is the fraction of the noun phrases judged by the
system to have antecedents whose antecedents were estimated correctly. Recall
is the fraction of the noun phrases that actually have antecedents whose
antecedents were estimated correctly.

In these experiments we used training sentences and test sentences. The
training sentences were used to make the heuristic rules in Section 3.4.2 by
hand. The test sentences were used to verify the effectiveness of these rules.
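As a small illustration of how the two rates are obtained from counts, the following sketch recomputes the figures reported for the method of this work. The function name `precision_recall` and its parameter names are hypothetical; the counts are the ones given in Table 3.4.

```python
def precision_recall(correct, judged, actual):
    """Return (precision, recall) as fractions.

    correct: noun phrases whose antecedents were estimated correctly
    judged:  noun phrases the system judged to have antecedents
    actual:  noun phrases that actually have antecedents
    """
    return correct / judged, correct / actual


# Counts reported in Table 3.4 for the method of this work.
training = precision_recall(correct=130, judged=159, actual=153)
test = precision_recall(correct=89, judged=113, actual=115)

for name, (p, r) in (("training", training), ("test", test)):
    print(f"{name}: precision {p:.0%}, recall {r:.0%}")
```

The denominator of recall is the same for every method on a given data set, because it depends only on the sentences and not on the system's output.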

In Table 3.5, Method 1 ("Only when it is estimated to be definite can it refer
to another noun phrase") allows a noun phrase to refer to another noun phrase
only when its estimated referential property is definite; the modifier
constraint and the possessor constraint are also used. Method 2 ("The method
of this work") is the method described in Section 3.3, which uses all three
constraints. Method 3 ("No use of referential property") does not use the
referential property and relies only on information such as distance,
topic-focus, modifier, and possessor. Method 4 ("No use of modifier constraint
and possessor constraint") omits the modifier constraint and the possessor
constraint. In Method 5 ("The same two nouns co-refer"), a noun phrase always
refers to a noun phrase that has the same head noun.

Table 3.5: Comparison

                                Method 1        Method 2        Method 3        Method 4        Method 5
Training sentences  Precision   92% (117/127)   82% (130/159)   72% (123/170)   65% (138/213)   52% (134/260)
                    Recall      76% (117/153)   85% (130/153)   80% (123/153)   90% (138/153)   88% (134/153)
Test sentences      Precision   92% (78/85)     79% (89/113)    69% (79/114)    58% (92/159)    47% (102/218)
                    Recall      68% (78/115)    77% (89/115)    69% (79/115)    80% (92/115)    89% (102/115)

Method 1 : Only when it is estimated to be definite can it refer to another
           noun phrase
Method 2 : The method of this work
Method 3 : No use of referential property
Method 4 : No use of modifier constraint and possessor constraint
Method 5 : The same two nouns co-refer

Several observations follow from the table. In Method 2 ("The method of this
work"), both the recall and the precision were high. This indicates that the
referential property was used properly in the method described in this
chapter. Method 2 was higher than Method 3 ("No use of referential property")
in both recall and precision. This indicates that the information of the
referential property is necessary. In Method 1 ("Only when it is estimated to
be definite can it refer to another noun phrase"), the recall was low. This is
because many noun phrases that are definite were estimated to be indefinite or
generic, and the system therefore judged that they cannot refer to other noun
phrases. In Method 4 ("No use of modifier constraint and possessor
constraint"), the precision was low. Since the modifier constraint and the
possessor constraint were not used, many pairs of noun phrases that do not
co-refer, such as "HIDARI(left)-NO HOO(cheek)" and "MIGI(right)-NO
HOO(cheek)", were incorrectly interpreted as co-referring. This indicates that
it is necessary to use the modifier constraint and the possessor constraint.
In Method 5 ("The same two nouns co-refer"), the precision was lower than in
Method 4. This is because referential properties were not used and the system
judged that a noun phrase which is not a definite noun phrase refers to
another noun phrase.

3.5.2 Examples of Errors 

Through the above experiments we found that it is necessary to use modifiers
and possessors. However, when the possessor of a noun was estimated
incorrectly, the referent was also estimated incorrectly, as in the following
example.

OJIISAN-WA [OJIISAN-NO] SENAKA-KARA SHIBA-WO OROSHIMASHITA.
(old man) (old man's) (back) (firewood) (take down)
(He took down the bundle of firewood from his back.)

(an omission of a middle part)

OJIISAN-WA OTOKOTACHI-WO NINGEN-DATO OMOTTEIMASHITAGA,
(old man) (man) (human beings) (think)
(The old man thought they were human beings,)

MAMONAKU TENGU-DEARU-KOTO-GA WAKARIMASHITA.
(soon) (tengu) (realize)
(but soon he realized that they were "tengu," or supernatural beings.)

[TENGU-NO] SENAKA-NIWA OOKINA TSUBASA-GA ARUNODESU.
(tengu) (back) (large) (wing) (have on)
(They had large wings on their backs.)

Since the underlined "SENAKA (back)" in this example is a part of an animal,
its possessor is estimated. Although the proper possessor is "TENGU (tengu)",
the system incorrectly estimated that the possessor was "OJIISAN (old man)",
which is the topic of the previous sentence. For this reason, our system
incorrectly judged that this "SENAKA (back)" refers to the twice underlined
"[OJIISAN-NO] SENAKA (the old man's back)".

Sometimes a noun can refer to a noun that has a different modifier. In such a
case, the system made an incorrect judgment.

OJIISAN-WA CHIKAKU-NO OOKINA SUGI-NO KI-NO NEMOTO-NI ARU
(old man) (near) (huge) (cedar) (tree) (base) (be at)

ANA-DE AMAYADORI-WO SURU-KOTO-NI-SHIMASHITA.
(hole) (take shelter from the rain) (decide to do)
(So, he decided to take shelter from the rain in a hole which is at the base
of a huge cedar tree nearby.)

(an omission of a middle part)

TSUGI-NO-HI, KONO OJIISAN-WA YAMA-HE ITTE,
(next day) (this) (old man) (mountain) (go to)
(The next day, this man went to the mountain,)

SUGI-NO KI-NO NEMOTO-NO ANA-WO MITSUKEMASHITA.
(cedar) (tree) (at base) (hole) (found)
(and found the hole at the base of the cedar tree.)

The two occurrences of "ANA (hole)" in these sentences co-refer. But our
system judged that they do not co-refer because their modifiers are different.
In order to analyze this case correctly, it is necessary to decide whether two
different expressions are equal in meaning.

3.6 Summary 

This chapter described a method for estimating the referents of noun phrases
using referential properties, modifiers, and possessors. With this method, we
obtained a precision rate of 82% and a recall rate of 85% in the estimation of
referents of noun phrases that have antecedents on training sentences, and a
precision rate of 79% and a recall rate of 77% on test sentences. We verified
that it is effective to use the referential properties, modifiers, and
possessors of noun phrases.



Chapter 4 

Indirect Anaphora Resolution 
in Noun Phrases 



4.1 Introduction 

Chapter 3 described the case in which a noun phrase refers to an entity that
has already been mentioned. Chapter 4 describes the case in which a noun
phrase refers to an entity that has not been mentioned yet but is associated
with an entity that has already been mentioned. For example, in "I went into
an old house last night. The roof was leaking badly and ...", "The roof" is
associated with "an old house", which has already been mentioned. This kind of
reference (indirect anaphora) has not been thoroughly studied in natural
language processing (1), but it is important for coherence resolution,
language understanding, and machine translation. We propose a method to
resolve indirect anaphora in Japanese nouns using the relationships between
two nouns.

When we analyze indirect anaphora, we need a case frame dictionary for nouns
containing information about relations between two nouns. For example, in the
case of the above example, the knowledge that "roof" is a part of "house" is
required to analyze the indirect anaphora. But no such noun case frame
dictionary exists at present. We therefore considered whether an example-based
method could be used to solve this problem. The knowledge that "roof" is a
part of "house" is analogous to the expression "roof of house". Therefore we
use examples of the form "X of Y" instead. In the above example, we use
linguistic data such as "the roof of a house". In the case of verbal nouns, we
do not use "X of Y" but a verb case frame dictionary. This is because a noun
case frame is similar to a verb case frame and a verb case frame dictionary
exists at present.

(1) [Nagao et al 76] investigated the resolution of indirect anaphora for some
nouns such as "TAISEKI (volume)" in sentences on chemistry. But there is no
research on resolving indirect anaphora for nouns in general.

The next section describes a method of resolving indirect anaphora. 

4.2 How to Resolve Indirect Anaphora 

An anaphor and its antecedent in indirect anaphora have a certain relation.
For example, "YANE (roof)" and "HURUI IE (old house)" in the following
sentences are in an indirect anaphoric relation, which is a part-of relation.

SAKUBAN ARU HURUI IE-NI ITTA.
(last night) (a certain) (old) (house) (go)
(I went into an old house last night.)

YANE-WA HIDOI AMAMORIDE ...
(roof) (badly) (be leaking)
(The roof was leaking badly and ...)
(4.1)

When we analyze indirect anaphora, we need a dictionary containing
information about relations between anaphors and their antecedents.

We show examples of the relations between an anaphor and its antecedent in
Table 4.1. The form of Table 4.1 is similar to the form of a verb case frame
dictionary. We call a dictionary containing the relations between two nouns a
noun case frame dictionary. But no noun case frame dictionary has been created
so far. Therefore, we substitute it by examples of "X NO Y (Y of X)" and by a
verb case frame dictionary. "X NO Y" is a Japanese expression. It means "Y of
X", "Y in X", "Y for X", etc.
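To make the idea of a noun case frame dictionary concrete, here is a minimal sketch of how a few of the entries of Table 4.1 might be stored and queried. The data layout, the names `Frame`, `NOUN_CASE_FRAMES`, `IS_A`, and `relations`, and the two is-a links are assumptions made for illustration; the thesis does not prescribe a representation, and a real system would consult a thesaurus instead of the tiny `IS_A` map.

```python
from typing import NamedTuple


class Frame(NamedTuple):
    antecedent_class: str  # semantic class the antecedent must belong to
    relation: str          # relation between the anaphor and the antecedent


# A few entries transcribed from Table 4.1 (romanized Japanese nouns).
NOUN_CASE_FRAMES = {
    "YANE": [Frame("TATEMONO", "part of")],
    "KOKUMIN": [Frame("KUNI", "belong")],
    "KENKYUU": [
        Frame("HITO", "agent"),
        Frame("SOSHIKI", "agent"),
        Frame("GAKUMON BUN'YA", "object"),
    ],
}

# Hypothetical is-a links standing in for a thesaurus lookup.
IS_A = {"IE": "TATEMONO", "NIHON": "KUNI"}


def relations(anaphor, candidate):
    """List the relations under which `candidate` can be the antecedent of `anaphor`."""
    candidate_class = IS_A.get(candidate, candidate)
    return [
        frame.relation
        for frame in NOUN_CASE_FRAMES.get(anaphor, [])
        if frame.antecedent_class == candidate_class
    ]


print(relations("YANE", "IE"))     # a house is a building, so "roof" can be part of it
print(relations("YANE", "NIHON"))  # no frame of "roof" accepts a country
```

Because such a dictionary is not available, the method in this chapter approximates the lookup with examples of "X NO Y" and with a verb case frame dictionary.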

Table 4.1: Example of noun case frame dictionary

Anaphor                        Things which can be the Antecedent         Relation
KAZOKU (family)                HITO (human)                               belong
KOKUMIN (nation)               KUNI (country)                             belong
GENSHU (the head of state)     KUNI (country)                             belong
YANE (roof)                    TATEMONO (building)                        part of
MOKEI (model)                  SEISANBUTSU (product)                      object
                               (ex. HIKOUKI (airplane), HUNE (ship))
GYOUJI (event)                 SOSHIKI (organization)                     agent
JINKAKU (personality)          HITO (human)                               possessive
KYOUIKU (education)            HITO (human)                               agent
                               HITO (human)                               recipient
                               NOURYOKU (ability)                         object
                               (ex. SUUGAKU (mathematics))
KENKYUU (research)             HITO (human), SOSHIKI (organization)       agent
                               GAKUMON BUN'YA (field of study)            object

Table 4.2: Case frame of verb "KUICHIGAU (differ)"

Surface Case          Semantic Marker     Examples
GA-case (subject)     abstract            DEETA (data), IKEN (opinion)
TO-case (object)      abstract            DEETA (data), MIKATA (viewpoint)

Resolution of indirect anaphora is done by the following steps.

1. We detect the elements that will be analyzed in indirect anaphora
   resolution using "X NO Y" and a verb case frame dictionary. When a noun
   is a verbal noun, we use a verb case frame dictionary. Otherwise, we use
   examples of "X NO Y". For example, "KUICHIGAI (difference)" is a verbal
   noun, and we use the case frame of the verb "KUICHIGAU (differ)" for the
   indirect anaphora resolution of "KUICHIGAI (difference)". The case frame
   is shown in Table 4.2. In this table there are two case components, the
   GA-case (subject) and the TO-case (object). These two case components are
   the elements that will be analyzed in indirect anaphora resolution.

   Tom-WA DEETA-WO KONPYUUTA-NI UCHIKONDE-IMASHITA.
   (Tom) (data) (computer) (store)
   (Tom was storing the data in a computer.)

   YATTO HANBUN YARIOEMASHITA.
   (Finally) (half) (finish)
   (Finally he was half finished.)

   John-GA HURUI DEETA-WO MISEMASHITA.
   (John) (old) (data) (show)
   (John showed him some old data.)

   IKUTSUKA-NO KUICHIGAI-WO SETSUMEISHITE-KURE-MASHITA.
   (several) (difference) (explain)
   (Tom did John a favor of explaining several differences.)
   (4.2)

2. We take possible antecedents from the topics and foci in previous
   sentences. We give each of them a weight that represents its plausibility
   as the antecedent, because topics and foci differ in plausibility.

3. We determine the antecedent by combining the weight of topics and foci in
   step 2, the weight of semantic similarity in "X NO Y" or a verb case frame
   dictionary, and the weight of the distance between an anaphor and its
   possible antecedent.

For example, when we want to clarify the antecedent of "YANE (roof)" in the
sentences (4.1), we gather examples of "<noun X> NO YANE (roof)" (roof of
<noun X>), and select a possible noun which is semantically similar to
<noun X> as its antecedent. Also, when we want to find the antecedent of
"KUICHIGAI (difference)" in the sentences (4.2), we select as its antecedent a
possible noun which satisfies the semantic marker in the case frame of
"KUICHIGAU (differ)" in Table 4.2 or is semantically similar to the examples
of the components in the case frame.

We think that errors caused by substituting a verb case frame for a noun case
frame are rare, but many errors will occur when we substitute "X NO Y" for a
noun case frame. This is because "X NO Y (Y of X)" covers many semantic
relations, in particular a feature relation (ex. a man of ability), which
cannot be an indirect anaphoric relation. To reduce the errors, we use the
following procedure.

1. We do not use an example of the form "noun X NO noun Y (Y of X)," when 
the noun X is an adjective noun (ex. HONTOU (reality)), a numeral, or a 
temporal noun. For example, we do not use "HONTOU (reality) NO (of) 
HANNIN (criminal) (a real criminal)". 

2. We do not use an example of the form "noun X NO noun Y (Y of X)," 
when the noun Y is a noun that cannot be an anaphor of indirect anaphora. 
For example, we do not use "noun X NO TSURU (crane)", "noun X NO 
NINGEN (human being)." 

We cannot completely avoid the errors by introducing the above procedure, but 
we can reduce the errors to a certain extent. 
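The two exclusion conditions above amount to a filter over the collected examples, which might be sketched as follows. The word sets are hypothetical placeholders (only "HONTOU", "TSURU", and "NINGEN" come from the text; the numerals and the temporal noun are invented for illustration), and a real system would take these categories from a lexicon.

```python
# Hypothetical word lists used only for this illustration.
ADJECTIVE_NOUNS = {"HONTOU"}            # from the text: HONTOU (reality)
NUMERALS = {"ICHI", "NI"}               # assumed examples of numerals
TEMPORAL_NOUNS = {"KINOU"}              # assumed example of a temporal noun
NON_ANAPHOR_NOUNS = {"TSURU", "NINGEN"}  # from the text: nouns that cannot be anaphors


def usable(noun_x, noun_y):
    """Return True if the example "noun_x NO noun_y" may be used."""
    # Condition 1: noun X must not be an adjective noun, a numeral, or a temporal noun.
    if noun_x in ADJECTIVE_NOUNS or noun_x in NUMERALS or noun_x in TEMPORAL_NOUNS:
        return False
    # Condition 2: noun Y must be able to act as an anaphor of indirect anaphora.
    if noun_y in NON_ANAPHOR_NOUNS:
        return False
    return True


examples = [("HONTOU", "HANNIN"), ("IE", "YANE"), ("KUNI", "NINGEN")]
kept = [pair for pair in examples if usable(*pair)]
print(kept)
```

Only ("IE", "YANE") survives here: the first pair is dropped by condition 1 and the third by condition 2.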

Nouns such as "ICHIBU (part)", "TONARI (neighbor)" and "BETSU (other)" require
additional consideration. When such a noun is a case component of a verb, we
use information on the semantic constraint of the verb, which we take from a
verb case frame dictionary.

TAKUSAN-NO KURUMA-GA KOUEN-NI TOMATTE-ITA.
(many) (car) (in the park) (there were)
(There were many cars in the park.)

ICHIBU-WA KITA-NI MUKATTA.
(A part (of them)) (to the north) (went)
(A part of them went to the north.)
(4.3)

In this example, since "ICHIBU (part)" is the GA-case (subject) of the verb
"MUKAU (go)", we consult the GA-case (subject) of the case frame of "MUKAU
(go)". Noun phrases that can fill the case component are listed in the GA-case
(subject) of the case frame. In this case, "KARE (he)" and "HUNE (ship)" are
listed as examples of things that can fill the case component. This indicates
that the antecedent is semantically similar to "KARE (he)" and "HUNE (ship)".
Since "TAKUSAN NO KURUMA (many cars)" is semantically similar to "HUNE (ship)"
in that both are vehicles, it is judged to be the proper antecedent.

When such a noun as "TONARI (neighbor or next)" modifies a noun X as 
"TONARI NO X", we think that the antecedent is a noun which is similar to 
noun X in meaning. 

OJIISAN-WA OOYOROKOBI-WO-SHITE IE-NI KAERIMASHITA.
(the old man) (in great joy) (house) (returned)
(The old man returned home (house) in great joy,)

OKOTTA KOTO-WO HITOBITO-NI HANASHIMASHITA.
(had happened to him) (all things) (everybody) (told)
(and told everybody all that had happened to him.)

TONARI-NO IE-NI OJIISAN-GA MOUHITORI SUNDE-ORIMASHITA.
(next) (house) (old man) (another) (live)
(There lived in the next house another old man.)
(4.4)

For example, when "TONARI (neighbor or next)" modifies "IE (house)," we judge 
that the antecedent of "TONARI (neighbor or next)" is "IE (house)" in the first 
sentence. 

4.3 Anaphora Resolution System 

4.3.1 Procedure 

Analysis of indirect anaphora is performed in the same framework as in
Chapter 3. First, sentences are transformed into a case structure by the case
structure analyzer [Kurohashi & Nagao 94]. Next, antecedents in indirect
anaphora are determined by heuristic rules for each noun from left to right.
Using these rules, our system takes possible antecedents and gives them
points. It judges that the candidate having the maximum total score is the
desired antecedent.
The heuristic rules are given in the following form.

Condition => { Proposal, Proposal, .. }
Proposal := ( Possible-Antecedent, Point )

Surface expressions, semantic constraints, referential properties, and so on,
are written as conditions in the Condition part. A possible antecedent is
written in the Possible-Antecedent part. Point means the plausibility of the
possible antecedent.

Table 4.3: The weight (W) in the case of topic

Surface Expression               Example                               Weight
Pronoun/Zero-Pronoun GA/WA       (John GA (subject)) SHITA (done).     21
Noun WA/NIWA                     John WA (subject) SHITA (do).         20

Table 4.4: The weight (W) in the case of focus

Surface Expression               Example                               Weight
Pronoun/Zero-Pronoun             (John NI (to)) SHITA (done).          16
WO (object)/NI (to)/KARA (from)
Noun GA (subject)/MO/DA/NARA     John GA (subject) SHITA (do).         15
Noun WO (object)/NI/, /.         John NI (object) SHITA (do).          14
Noun HE (to)/DE (in)/KARA (from) GAKKOU (school) HE (to) IKU (go).     13
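The rule form "Condition => { Proposal, ... }" with "Proposal := ( Possible-Antecedent, Point )" can be pictured as a small data structure plus a loop that sums points per candidate. The sketch below is an assumption about one possible encoding, not the system's actual code: the names `Rule` and `resolve`, the dictionary used to describe a noun, and the two toy rules are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, List, Tuple

Proposal = Tuple[str, float]  # (possible antecedent, point)


@dataclass
class Rule:
    name: str
    condition: Callable[[dict], bool]           # the Condition part
    propose: Callable[[dict], List[Proposal]]   # the proposals made when it holds


def resolve(noun, rules):
    """Apply every rule whose condition holds and pick the best-scoring candidate."""
    totals = defaultdict(float)
    for rule in rules:
        if rule.condition(noun):
            for candidate, point in rule.propose(noun):
                totals[candidate] += point
    best = max(totals, key=totals.get) if totals else None
    return best, dict(totals)


# Two toy rules: one proposes "Indefinite", the other passes through
# candidates whose points are assumed to have been computed elsewhere.
rules = [
    Rule("indefinite", lambda n: n["estimated"] == "indefinite",
         lambda n: [("Indefinite", 10)]),
    Rule("topic-focus", lambda n: True,
         lambda n: list(n["candidates"])),
]

noun = {"estimated": "indefinite",
        "candidates": [("SEIDOKU", 25), ("DORUDAKA", -17)]}
best, totals = resolve(noun, rules)
print(best, totals)
```

Keeping the condition and the proposals as separate callables mirrors the two parts of the written rule form and lets several rules contribute points to the same candidate.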



4.3.2 Heuristic Rule for Estimating Antecedents 

Resolution of indirect anaphora is performed by adding the rules for indirect
anaphora resolution to the rules for direct anaphora resolution. We wrote 12
heuristic rules for noun phrase anaphora resolution. The rules R1 to R8 for
noun phrase direct anaphora are shown in Section 3.4.2. The rules for noun
phrase indirect anaphora are as follows.



R9 When a noun phrase Y is not a verbal noun, =>
   { (A topic which has the weight W and the distance D, W - D + P + S),
     (A focus which has the weight W and the distance D, W - D + P + S),
     (A subject in a subordinate clause or a main clause of the clause,
      23 + P + S) }

   The weights W of topics and foci are given in Table 4.3 and Table 4.4,
   respectively, and represent the preference of the desired antecedent. The
   distance D is the number of the topics (foci) between the anaphor and a
   possible antecedent which is a topic (focus). The value P is given in
   Table 4.5 by the score of the definiteness in the referential property
   analysis described in Chapter 2. This is because it is easier for a
   definite noun phrase to have an antecedent than for an indefinite noun
   phrase. The value S is the semantic similarity between a possible
   antecedent and a Noun X of "Noun X NO Noun Y". The semantic similarity is
   given by the similarity level in "Bunrui Goi Hyou" [NLRI 64] as in
   Table 4.6.

Table 4.5: The plausibility (P) that the referential property is definite

The score in the estimation of the referential property             Plausibility P
When the score of the definite noun phrase is the best                   5
When the score of the definite noun phrase is equal to the score         0
of the indefinite noun phrase or the generic noun phrase
When the score of the definite noun phrase is 1 lower than the          -5
score of the indefinite noun phrase or the generic noun phrase
When the score of the definite noun phrase is 2 lower than the          -10
score of the indefinite noun phrase or the generic noun phrase
When the score of the definite noun phrase is more than 2 lower         -∞
than the score of the indefinite noun phrase or the generic noun
phrase

Table 4.6: Points given to non-verbal nouns by the semantic similarity

Similarity Level     0      1      2      3      4      5      6      Exact Match
Point               -10    -2      1      2      2.5    3      3.5    4

R10 When a noun phrase is a verbal noun, =>
   { (analyze in Zero Pronoun Resolution Module in Chapter 5, 20) }
   In Zero Pronoun Resolution Module, indirect anaphora is resolved using
   the semantic constraint in a verb case frame and the distance between an
   anaphor and an antecedent.

R11 When a noun phrase is a noun such as "ICHIBU" and "TONARI", and it
   modifies a noun X, =>
   { (the same noun as the noun X, 30) }

R12 When a noun phrase is a noun such as "ICHIBU" and "TONARI", and it
   is a case component of a verb, =>
   { (analyze in the module similar to R10, 30) }
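The score that rule R9 assigns to a topic or focus, W - D + P + S, can be sketched as a function over the lookup tables. This is an illustrative reading under stated assumptions: the dictionary keys are informal labels invented here, the function names are hypothetical, and two table cells that are not legible in the source (the value of P for a tie and the label of the lowest similarity level) are assumed to be 0.

```python
import math

# Weights W from Tables 4.3 and 4.4; the keys are informal labels for this sketch.
TOPIC_FOCUS_WEIGHT = {
    "pronoun GA/WA": 21,
    "noun WA/NIWA": 20,
    "pronoun WO/NI/KARA": 16,
    "noun GA/MO/DA/NARA": 15,
    "noun WO/NI": 14,
    "noun HE/DE/KARA": 13,
}

# Points S from Table 4.6; the level label 0 is an assumption.
SIMILARITY_POINT = {0: -10, 1: -2, 2: 1, 3: 2, 4: 2.5, 5: 3, 6: 3.5, "exact": 4}


def plausibility(definite_score, best_other_score):
    """P from Table 4.5, based on how far the definite score trails the best other score."""
    gap = best_other_score - definite_score
    if gap < 0:
        return 5          # the definite reading has the best score
    if gap == 0:
        return 0          # tie; the value 0 is an assumption of this sketch
    if gap == 1:
        return -5
    if gap == 2:
        return -10
    return -math.inf      # more than 2 lower: the candidate is ruled out


def r9_score(surface, distance, definite_score, best_other_score, similarity_level):
    """W - D + P + S for a topic or focus candidate."""
    w = TOPIC_FOCUS_WEIGHT[surface]
    p = plausibility(definite_score, best_other_score)
    s = SIMILARITY_POINT[similarity_level]
    return w - distance + p + s


print(r9_score("noun GA/MO/DA/NARA", 2, 3, 2, "exact"))
print(r9_score("noun WA/NIWA", 1, 1, 2, 3))
```

Using negative infinity for the last row of Table 4.5 means such a candidate can never win, whatever its weight and similarity.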

4.3.3 Example of Analysis 



An example of resolution of indirect anaphora is shown in Figure 4.1.
Figure 4.1 shows that the noun "KOUTEI BUAI (official rate)" is analyzed well.
This is explained as follows.

The system estimated the referential property of "KOUTEI BUAI (official
rate)" to be indefinite by the method described in Chapter 2. By the rule R6
in Section 3.4.2 the system took a candidate "Indefinite". When the candidate
"Indefinite" has the best score, the system does not analyze indirect
anaphora. By the rule R9 in Section 4.3.2 the system took four possible
antecedents: SEIDOKU (West Germany), JIKOKUTSUUKA (own currency), KYOUCHOU
(cooperation), and DORUDAKA (dollar's surge). The possible antecedents were
given points from the weight of topics and foci, the distance from the
anaphor, and so on. The system properly judged that SEIDOKU (West Germany),
which had the best score, was the desired antecedent.
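The totals in Figure 4.1 can be retraced by summing the components listed there for each candidate. In the sketch below the component values are copied from the figure as given; the dictionary layout and the key names (`base`, `D`, `P`, `S`) are assumptions made for illustration, with `base` holding either the topic-focus weight W or the fixed value 23 used for a subject.

```python
# Component values copied from Figure 4.1.
# "base" is the topic-focus weight W, or 23 for a subject (which has no distance term).
candidates = {
    "SEIDOKU":      {"base": 23, "D": 0, "P": -5, "S": 7},
    "JIKOKUTSUUKA": {"base": 14, "D": 2, "P": -5, "S": -30},
    "KYOUCHOU":     {"base": 14, "D": 3, "P": -5, "S": -30},
    "DORUDAKA":     {"base": 20, "D": 2, "P": -5, "S": -30},
}

# Rule R9: base - D + P + S for each possible antecedent.
totals = {
    name: c["base"] - c["D"] + c["P"] + c["S"]
    for name, c in candidates.items()
}

# Rule R6 proposed the candidate "Indefinite" with 10 points.
totals["Indefinite"] = 10

best = max(totals, key=totals.get)
print(totals, best)
```

SEIDOKU is the only candidate whose total exceeds the 10 points of "Indefinite", so it is selected as the antecedent.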

4.4 Experiment and Discussion 

Before determining antecedents in indirect anaphora, sentences were
transformed into a case structure by the case analyzer [Kurohashi & Nagao 94]
as in Chapter 3. The errors made by the analyzer were corrected by hand. We
used the IPAL dictionary [IPAL 87] as a verb case frame dictionary. We used
the Japanese Co-occurrence Dictionary [EDR 95c] as a source of examples for
"X NO Y".

KONO DORUDAKA-WA KYOUCHOU-WO GIKUSHAKU SASETEIRU.
(The dollar's surge) (cooperation) (is straining)
(The dollar's surge is straining the cooperation.)

JIKOKUTSUUKA-WO MAMOROUTO SEIDOKU-GA KOUTEIBUAI-WO AGETA.
(own currency) (to protect) (West Germany) (official rate) (raised)
(West Germany raised its official rate to protect the Mark.)

                   Indefinite   SEIDOKU          JIKOKUTSUUKA     KYOUCHOU        DORUDAKA
                                (West Germany)   (own currency)   (cooperation)   (dollar's surge)
R6                 10
R9                              25               -23              -24             -17
  Subject                       23
  T-F (W)                                        14               14              20
  Distance (D)                                   -2               -3              -2
  Definite (P)                  -5               -5               -5              -5
  Similarity (S)                7                -30              -30             -30
Total Score        10           25               -23              -24             -17

Examples of "noun X NO KOUTEIBUAI (official rate)":
"NIHON (Japan) NO KOUTEIBUAI (official rate)",
"BEIKOKU (USA) NO KOUTEIBUAI (official rate)"

Figure 4.1: Example of indirect anaphora resolution

We show the result of anaphora resolution using both "X NO Y" and a verb case
frame dictionary in Table 4.7. We obtained a recall rate of 63% and a
precision rate of 68% in the estimation of indirect anaphora on test
sentences. This indicates that the information of "X NO Y" is useful to a
certain extent when we cannot make use of a noun case frame dictionary. We
also tested the case in which the system does not use any semantic
information. The precision and the recall were lower. This indicates that
semantic information is necessary. The experiment was performed by fixing all
the semantic similarity values S to 0.

Table 4.7: Result

             Non-verbal Noun               Verbal Noun                   Total
             Recall        Precision       Recall        Precision       Recall         Precision

Experiment in the case that the system does not use any semantic information
  Training   85% (56/66)   67% (56/83)     40% (14/35)   44% (14/32)     69% (70/101)   61% (70/115)
  Test       53% (20/38)   50% (20/40)     47% (15/32)   42% (15/36)     50% (35/70)    46% (35/76)

Experiment using "X NO Y" and verb case frame
  Training   91% (60/66)   86% (60/70)     66% (23/35)   79% (23/29)     82% (83/101)   84% (83/99)
  Test       63% (24/38)   83% (24/29)     63% (20/32)   56% (20/36)     63% (44/70)    68% (44/65)

Estimation for the hypothetical case when we can use noun case frame dictionary
  Training   91% (60/66)   88% (60/68)     69% (24/35)   89% (24/27)     83% (84/101)   88% (84/95)
  Test       79% (30/38)   86% (30/35)     63% (20/32)   77% (20/26)     71% (50/70)    82% (50/61)

The upper row and the lower row of each experiment show rates on training
sentences and test sentences, respectively.

The training sentences are used to set by hand the values given in the rules
in Section 4.3.2.

Training sentences {example sentences [Walker et al 94] (43 sentences), a folk
tale "KOBUTORI JIISAN" [Nakao 85] (93 sentences), an essay in "TENSEIJINGO"
(26 sentences), an editorial (26 sentences)}

Test sentences {a folk tale "TSURU NO ONGAESHI" [Nakao 85] (91 sentences), two
essays in "TENSEIJINGO" (50 sentences), an editorial (30 sentences)}

Precision is the fraction of the noun phrases which were judged to have the
antecedents of indirect anaphora. Recall is the fraction of the noun phrases
which have the antecedents of indirect anaphora. We use both precision and
recall for evaluation because the system sometimes judges that a noun which is
not an antecedent of indirect anaphora is one, and we want to check these
errors thoroughly.

Further, we made an estimate for the hypothetical case in which a noun case
frame dictionary is available. The estimate was made as follows. We looked
over the errors in the experiment using "X NO Y" and a verb case frame
dictionary, and we regarded the errors caused by one of the following three
reasons as right answers.

1. Proper examples do not exist in examples of "X NO Y" or a verb case frame
dictionary.

2. Wrong examples exist in examples of "X NO Y" or a verb case frame
dictionary.

3. A noun case frame is different from a verb case frame.

If we make a noun case frame dictionary ourselves, the dictionary will contain
some errors, and the success ratio will be lower than the estimate in
Table 4.7.



Discussion of Errors 

Even if we have a noun case frame dictionary, there are certain pairs of nouns in 
indirect anaphoric relation that cannot be resolved by our framework. 

KON'NA HIDOI HUBUKI-NO NAKA-WO ITTAI DARE-GA KITA-NO-KA-TO
IBUKARINAGARA, OBAASAN-WA IIMASHITA.
(Wondering who could have come in such a heavy snowstorm, the old woman said:)

"DONATA-JANA"
("Who is it?")

TO-WO AKETEMIRUTO, SOKO-NIWA ZENSHIN YUKI-DE MASSHIRONI NATTA
MUSUME-GA TATTE ORIMASHITA.
(She opened the door, and there stood before her a girl all covered with
snow.)
(4.5)

The underlined "MUSUME (a daughter or a girl)" has two main meanings: a
daughter and a girl. In the above example, "MUSUME" means girl and has no
indirect anaphoric relation. But the system incorrectly judged that it is the
daughter of "OBAASAN (the old woman)". This is a problem of noun role
ambiguity and is very difficult to solve.

The following example is also a difficult problem.

SHUSHOU-WA TEIKOU-NO TSUYOI SENKYOKU-NO KAISHOU-WO MIOKUTTA.
(prime minister) (resistance) (very) (electoral district) (modification) (give up)
(The prime minister gave up the modification of some electoral districts where
the resistance was very strong.)
(4.6)

The underlined "TEIKOU (resistance)" appears from the surface expression to
refer indirectly to "SENKYOKU (electoral district)". But actually "TEIKOU
(resistance)" refers to the candidates of "SENKYOKU (electoral district)", not
to "SENKYOKU (electoral district)" itself. To arrive at this conclusion it is
necessary to use a two-step relation, "an electoral district => candidates"
and "candidates => resist", in sequence. However, it is not easy to change our
system to deal with two-step relations, because if we apply two relations in
sequence to nouns, many nouns which are not in an indirect anaphoric relation
will be incorrectly judged as indirect anaphora. A new method is required to
infer two relations in sequence.

4.5 Consideration of Construction of Noun Case 
Frame Dictionary 

We used "X NO Y (Y of X)" to resolve indirect anaphora. But we will get a higher
accuracy rate if we can utilize a good noun case frame dictionary. Therefore we
have to consider how to construct one. A key is to get the detailed meaning of
"NO (of)" in "X NO Y". If it can be obtained automatically, a noun case frame
dictionary can be constructed automatically. If the semantic analysis of
"X NO Y" cannot be done well, how do we construct the dictionary? We think that
it is still worthwhile to construct it using "X NO Y". For example, we arrange
"noun X NO noun Y" in the order of the meaning of "noun Y", arrange them in the
order of the meaning of "noun X", delete those whose "noun X" is an adjective
noun, and obtain Table 4.8. Here we use the thesaurus "Bunrui Goi Hyou"
[NLRI 64] to get the meanings of nouns. We think that it is not difficult to
construct a noun case frame dictionary from Table 4.8 by hand. We will make a
noun case frame dictionary by removing "AITE (partner)" in the row of "KOKUMIN
(nation)", "RAIHIN (visitor)" in the row of "GENSHU (the head of state)", and
noun phrases which mean characters and features. When we look over the noun
phrases in a certain row and almost all of them mean countries, we will also
record, using semantic markers, the feature that the slot is easily filled by
countries. When we make a noun case frame dictionary, we must remember that the
examples of "X NO Y" are insufficient, and must add examples. Since the examples
are arranged in the order of meaning in this method, adding examples will not be
so difficult.

Table 4.8: Examples of arranged "X NO Y"

Noun Y                    Arranged Noun X

KOKUMIN (nation)          <Human> AITE (partner) <Organization> KUNI (country),
                          SENSHINKOKU (an advanced country), RYOUKOKU (the two
                          countries), NAICHI (inland), ZENKOKU (the whole
                          country), NIHON (Japan), SOREN (the Soviet Union),
                          EIKOKU (England), AMERIKA (America), SUISU
                          (Switzerland), DENMAAKU (Denmark), SEKAI (the world)

GENSHU (the head of       <Human> RAIHIN (visitor) <Organization> GAIKOKU (a
state)                    foreign country), KAKKOKU (each country), POORANDO
                          (Poland)

YANE (roof)               <Organization> HOKKAIDO (Hokkaido), SEKAI (the world),
                          GAKKOU (school), KOUJOU (factory), GASORINSUTANDO
                          (gas station), SUUPAA (supermarket), JITAKU (one's
                          home), HONBU (the head office) <Product> KURUMA (car),
                          JUUTAKU (housing), IE (house), SHINDEN (temple),
                          GENKAN (entrance), SHINSHA (new car) <Phenomenon>
                          MIDORI (green) <Action> KAWARABUKI (tile-roofed)
                          <Mental> HOUSHIKI (method) <Character> KEISHIKI (form)

MOKEI (model)             <Animal> ZOU (elephant) <Nature> FUJISAN (Mt. Fuji)
                          <Product> IMONO (an article of cast metal), MANSHON
                          (an apartment house), KAPUSERU (capsule), DENSHA
                          (train), HUNE (ship), GUNKAN (warship), HIKOUKI
                          (airplane), JETTOKI (jet plane) <Action> ZOUSEN
                          (shipbuilding) <Mental> PURAN (plan) <Character>
                          UNKOU (movement)

GYOUJI (event)            <Human> KOUSHITSU (the Imperial Household), OUSHITSU
                          (a Royal family), IEMOTO (the head of a school)
                          <Organization> NOUSON (an agricultural village), KEN
                          (prefecture), NIHON (Japan), SOREN (the Soviet Union),
                          TERA (temple), GAKKOU (school) <Action> SHUUNIN (take
                          up one's post), MATSURI (festival), IWAI
                          (celebration), JUNREI (pilgrimage) <Mental> KOUREI
                          (an established custom), KOUSHIKI (formal)

JINKAKU (personality)     <Human> WATASHI (myself), NINGEN (human), SEISHOUNEN
                          (young people), SEIJIKA (statesman)
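The arrangement step described above can be sketched as follows. This is only
an illustration under stated assumptions: the CATEGORY mapping is a hypothetical
toy stand-in for a lookup in "Bunrui Goi Hyou", the function name arrange is
ours, and nouns Y are sorted alphabetically rather than by thesaurus code.

```python
from collections import defaultdict

# Hypothetical toy thesaurus: noun -> coarse semantic category.
# A real system would look these categories up in "Bunrui Goi Hyou".
CATEGORY = {
    "KUNI": "Organization",
    "NIHON": "Organization",
    "GAKKOU": "Organization",
    "AITE": "Human",
    "KURUMA": "Product",
    "IE": "Product",
}

def arrange(pairs, category=CATEGORY):
    """Group (X, Y) pairs taken from "X NO Y" by Y, then by the category of X."""
    table = defaultdict(lambda: defaultdict(list))
    for x, y in pairs:
        table[y][category.get(x, "Unknown")].append(x)
    # Sort at every level so the arrangement is stable and easy to read.
    return {
        y: {cat: sorted(xs) for cat, xs in sorted(groups.items())}
        for y, groups in sorted(table.items())
    }

pairs = [("KUNI", "KOKUMIN"), ("AITE", "KOKUMIN"), ("NIHON", "KOKUMIN"),
         ("KURUMA", "YANE"), ("GAKKOU", "YANE"), ("IE", "YANE")]
arranged = arrange(pairs)
for noun_y, groups in arranged.items():
    print(noun_y, groups)
```

Grouping by category in this way makes outliers such as "AITE" under
"KOKUMIN" easy to spot and remove by hand.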

4.6 Summary 

We presented a method for resolving indirect anaphora in Japanese nouns. To
analyze indirect anaphora, we need a noun case frame dictionary containing
information about noun relations. But no noun case frame dictionary exists at
present. Therefore, we used examples of "X NO Y (Y of X)" and a verb case frame
dictionary. We experimented with the estimation of indirect anaphora by using
this information, and obtained a recall rate of 63% and a precision rate of 68%
on test sentences. This indicates that the information of "X NO Y" is useful
when we cannot make use of a noun case frame dictionary. We also estimated the
results obtainable when a noun case frame dictionary can be used, and obtained
recall and precision rates of 71% and 82%, respectively. Finally, we proposed
how to construct a noun case frame dictionary from examples of "X NO Y".



Chapter 5 

An Estimate of Referents of 
Pronouns 



5.1 Overview 

We described in Chapter 3 and Chapter 4 how to estimate the referents of noun
phrases. This chapter describes how to resolve the referents of pronouns:
demonstrative pronouns, personal pronouns, and zero pronouns (ellipses of noun
phrases are called zero pronouns). Pronoun resolution is especially important
for machine translation. For example, if the system cannot resolve zero
pronouns, it cannot translate sentences containing them from Japanese into
English. When the word order of sentences is changed and the pronominalized
words are changed in translating into English, the system must detect the
referents of the pronouns.

There has been much work done in pronoun resolution [Nagao et al 76]
[Kameyama 86] [Yamamura et al 92] [Walker et al 94] [Takada & Doi 94]
[Nakaiwa & Ikehara 95]. Major distinguishing features of our work are as
follows:

• In conventional pronoun resolution methods, semantic markers have been
used for semantic constraints. We instead use examples for semantic
constraints, and show in our experiments that examples are as useful as
semantic markers. This result is important because the cost of constructing
a case frame using semantic markers is generally higher than the cost of
constructing one using examples.

• We use examples in the form "X of Y" for estimating referents of demon- 
strative adjectives. 

• We deal with the case when a demonstrative refers to elements which appear 
later. 

• We resolve a personal pronoun in quotation by estimating the speaker and 
the hearer. 

In this work, we used almost all the capabilities of conventional methods and
also proposed new methods.

In Section 5.2, we explain how the system estimates the referent of a pronoun.
Next, we explain the rules for demonstratives, personal pronouns, and zero
pronouns in Sections 5.3, 5.4, and 5.5, respectively. In Section 5.6, we report
the results of experiments using these rules. In Section 5.7, we conclude this
chapter.



5.2 The Framework for Estimating the Referent 

Pronoun resolution is performed in a framework similar to that in Chapter 3 and
Chapter 4. The antecedents of pronouns are determined by heuristic rules, from
left to right. Using these rules, our system gives points to possible
antecedents, and it judges that the possible antecedent having the maximum total
score is the desired antecedent.

Heuristic rules are classified into two kinds: Candidate enumerating rules and
Candidate judging rules. Candidate enumerating rules are used to enumerate
candidate antecedents and give them points (which represent the plausibility of
being the proper antecedent). Candidate judging rules are used to give points to
the candidate antecedents proposed by Candidate enumerating rules. The forms of
these rules are shown in Figure 5.1 and Figure 5.2. Surface expressions,
semantic constraints, referential properties, etc., are written as conditions
in the Condition part. A possible antecedent is written in the
Possible-Antecedent part. Points represent the plausibility of the possible
antecedent.

Condition ⇒ { Proposal Proposal ... }
Proposal := ( Possible-Antecedent Points )

Figure 5.1: Form of Candidate enumerating rule

Condition ⇒ ( Points )

Figure 5.2: Form of Candidate judging rule



An estimation of the referent is performed by using the total scores of possible
antecedents given by Candidate enumerating rules and Candidate judging rules.
First, the system applies all Candidate enumerating rules to the anaphor and
enumerates candidate antecedents with their points. Next, the system applies all
Candidate judging rules to all the candidate antecedents and sums up the score
of each candidate antecedent. Finally, the system judges that the candidate
antecedent having the best score is the proper antecedent. If several candidate
referents share the best score, the candidate referent enumerated first is
judged to be the proper antecedent.
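The two-stage scoring procedure can be sketched as below. This is a minimal
sketch, not the thesis implementation: the function name resolve, the rule
signatures, and the two toy rules are hypothetical, and we assume that points
proposed for the same candidate by several enumerating rules are added together.

```python
def resolve(anaphor, context, enumerating_rules, judging_rules):
    """Return the best-scoring candidate antecedent, or None if there is none.

    Each enumerating rule returns a list of (candidate, points) proposals.
    Each judging rule returns the points to add for one candidate.
    """
    scores = {}  # candidate -> total points; a dict keeps enumeration order
    for rule in enumerating_rules:
        for candidate, points in rule(anaphor, context):
            scores[candidate] = scores.get(candidate, 0) + points
    for candidate in scores:
        for rule in judging_rules:
            scores[candidate] += rule(anaphor, candidate, context)
    if not scores:
        return None
    # max() keeps the first maximal item, so a tie goes to the candidate
    # that was enumerated first.
    return max(scores, key=scores.get)

# Hypothetical toy rules, for illustration only.
def enumerate_nearby(anaphor, context):
    return [(noun, 15) for noun in context]

def penalise_human(anaphor, candidate, context):
    return -10 if candidate == "TAROO" else 0

best = resolve("SORE", ["TAROO", "KONPYUUTA"],
               [enumerate_nearby], [penalise_human])
print(best)
```

Keeping the rules as plain functions means new heuristics can be added without
touching the selection step.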

We made 50 Candidate enumerating rules and 10 Candidate judging rules for 
analyzing demonstratives, 4 Candidate enumerating rules and 6 Candidate judging 
rules for analyzing personal pronouns, and 19 Candidate enumerating rules and 4 
Candidate judging rules for analyzing zero pronouns. All of the rules are described 
in Appendix p. Some of the rules are described in the following sections. 

5.3 Heuristic Rule for Demonstrative 



We made heuristic rules for demonstratives by consulting [NLRI 81] [Bayashi 83]
[Takahashi et al 9C] [Kinsui & Takubo 92] and by examining Japanese sentences by
hand. Demonstratives fall into three categories: demonstrative pronouns,
demonstrative adjectives, and demonstrative adverbs. In the following sections,
we explain the rules for analyzing demonstratives.

^ The order is the order in which the rules are applied.

Table 5.1: The weight in the case of topic

Surface Expression            Example                              Weight
Pronoun/Zero-Pronoun GA/WA    (John GA (subject)) SHITA (done).    21
Noun WA/NIWA                  John WA (subject) SHITA (do).        20

Table 5.2: The weight in the case of focus

Surface Expression                 Example                              Weight
Pronoun/Zero-Pronoun               (John NI (to)) SHITA (done).         16
  WO (object)/NI (to)/KARA (from)
Noun GA (subject)/MO/DA/NARA       John GA (subject) SHITA (do).        15
Noun WO (object)/NI/,/.            John NI (object) SHITA (do).         14
Noun HE (to)/DE (in)/KARA (from)   GAKKOU (school) HE (to) IKU (go).    13



5.3.1 Rule for Demonstrative Pronoun 

Rule in the Case when the Referent is a Noun Phrase 

Candidate enumerating rule 1

When a pronoun is a demonstrative pronoun or "SONO (of it) / KONO (of
this) / ANO (of that)",

{(A topic which has the weight W and the distance D, W - D - 2)
(A focus which has the weight W and the distance D, W - D + 4)}

This bracket expression represents the list of proposals in Figure 5.1. The
definition and the weight W of topic and focus are shown in Table 5.1 and
Table 5.2. The distance (D) is the number of topics and foci between the
demonstrative and the possible referent. Since a demonstrative refers to
foci more often than a zero pronoun does, we add the terms -2 and +4 as
compared with the heuristic rules in zero pronoun resolution.

The score (in other words, the certainty value) of a candidate referent
depends on the weight of topics/foci and the distance in the text between the
demonstrative and the candidate referent.
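As a worked illustration of Candidate enumerating rule 1, the points can be
computed as below. The weights are those of Table 5.1 and Table 5.2; the
dictionary keys and the function name demonstrative_points are hypothetical
labels introduced only for this sketch.

```python
# Weights from Table 5.1 (topic) and Table 5.2 (focus).
# The key names are hypothetical labels for the surface expressions.
TOPIC_WEIGHT = {
    "pronoun_ga_wa": 21,        # Pronoun/Zero-Pronoun GA/WA
    "noun_wa_niwa": 20,         # Noun WA/NIWA
}
FOCUS_WEIGHT = {
    "pronoun_wo_ni_kara": 16,   # Pronoun/Zero-Pronoun WO/NI/KARA
    "noun_ga_mo_da_nara": 15,   # Noun GA/MO/DA/NARA
    "noun_wo_ni": 14,           # Noun WO/NI/,/.
    "noun_he_de_kara": 13,      # Noun HE/DE/KARA
}

def demonstrative_points(kind, surface, distance):
    """Points proposed for a topic or focus at the given distance D."""
    if kind == "topic":
        return TOPIC_WEIGHT[surface] - distance - 2
    if kind == "focus":
        return FOCUS_WEIGHT[surface] - distance + 4
    raise ValueError(f"unknown kind: {kind}")

# A "Noun WA" topic and a "Noun WO" focus, both at distance 0.
print(demonstrative_points("topic", "noun_wa_niwa", 0))
print(demonstrative_points("focus", "noun_wo_ni", 0))
```

With these terms, a nearby focus can match or outscore a topic even though its
raw weight in Table 5.2 is lower.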



Rule when the Referent is a Verb Phrase 

Candidate enumerating rule 2

When a pronoun is "SORE/ARE/KORE" or a demonstrative adjective,

{(The previous sentence (or, if one exists in the same sentence, the verb
phrase which is in a conditional form or contains a conjunctive particle such
as "GA (but)", "DAGA (but)", and "KEREDO (but)"), 15)}

The following is an example of a pronoun referring to the verb phrase of the 
previous sentence. 

TENGU-TACHI-WA MAMONAKU YATTEKITE
(the tengu) (presently) (came)
(Presently, they came)

MAENOBAN-NO-YOUNI UTATTARI ODOTTARI SHI-HAJIMEMASHITA.
(the previous night) (sing) (dance) (begin to do)
(and began singing and dancing just as they had done the previous night.)

(5.1)

OJIISAN-WA SORE-WO MITE, KON'NAHUUNI UTAI-HAJIMEMASHITA.
(the old man) (it) (see) (as follows) (begin to sing)
(When the old man saw this, he began to sing as follows.)

In these sentences, the demonstrative pronoun "SORE (it)" refers to the event
"TENGU-TACHI-GA UTATTARI ODOTTARI SHI-HAJIMEMASHITA (the tengu began singing
and dancing just as they had done the previous night)".

Table 5.3: Points given in the case of demonstrative pronouns

Similarity Level    0     1     2     3     4     5     6    Exact Match
Point               0     0   -10   -10   -10   -10   -10    -10

The following is an example of a pronoun referring to a verb phrase (the event)
containing a conjunctive particle such as "GA", "DAGA", and "KEREDO" in the
same sentence.

OJIISAN-WA ISSHOUKENMEINI UTAI SOSHITE ODORIMASHITAGA,
(the old man) (one's best) (sing) (and) (dance)
(The man did his best singing and dancing,)

(5.2)

SORE-WA KOTOBADE-IIARAWASENAIHODO HETAKUSODESHITA.
(they) (unspeakably) (poor)
(but they were unspeakably poor.)


Rule Using the Feature that Demonstrative Pronouns usually 
do not Refer to People 

Candidate judging rule 1

When a pronoun is a demonstrative pronoun and a candidate referent has the
semantic marker HUM (human), the candidate referent is given -10 points. We
use the Noun Semantic Marker Dictionary [Watanabe et al 92] as the semantic
marker dictionary.

Candidate judging rule 2

When a pronoun is a demonstrative pronoun, a candidate referent is given the
points in Table 5.3 according to the highest semantic similarity between the
candidate referent and the codes {5200003010 5201002060 5202001020 5202006115
5241002150 5244002100} in "Bunrui Goi Hyou (BGH)" [NLRI 64], which signify
human beings. When we calculate the semantic similarity, we use the modified
code table in Table 5.4. The reason for this modification is that some codes
in BGH [NLRI 64] are incorrect.



These rules use the fact that a demonstrative pronoun rarely refers to people,
and reduce the candidates for the referent. For example, we find that "SORE
(it)" in the following sentences refers to "KONPYUUTA (computer)", because
"SORE (it)" refers only to things which are not human, and the only non-human
noun near "SORE (it)" is "KONPYUUTA (computer)".

TAROO-WA SAISHIN-NO KONPYUUTA-WO KAIMASHITA.
(Taroo) (new) (computer) (buy)
(Taroo bought a new computer.)

(5.3)

JON-NI SASSOKU SORE-WO MISEMASHITA.
(John) (at once) (it) (show)
([Taroo] showed it at once to John.)

Table 5.4: Modification of category number of "BUNRUI GOI HYOU"

Semantic Marker               Original code       Modified code
ANI (animal)                  156                 511
HUM (human)                   12[0-4]             52[0-4]
ORG (organization)            125,126,127,128     535,536,537,538
PLA (plant)                   155                 611
PAR (part of living thing)    157                 621
NAT (natural)                 152                 631
PRO (products)                14[0-9]             64[0-9]
LOC (location)                117,125,126         651,652,653
PHE (phenomenon)              150,151             711,712
ACT (action)                  13[3-8]             81[3-8]
MEN (mental)                  130                 821
CHA (character)               11[2-58],158        83[2-58],839
REL (relation)                111                 841
LIN (linguistic products)     131,132             851,852
The others                    110                 861
TIM (time)                    116                 a11
QUA (quantity)                119                 b11

"125" and "126" are each given two category numbers.

Table 5.5: Points given demonstrative pronouns which refer to places

Similarity Level    0     1     2     3     4     5     6    Exact Match
Point             -10    -5     0     5    10    10    10    10



Rule with Feature that "KOKO" and "SOKO" Often Refer 
to Locations 

Candidate judging rule 3

When a pronoun is "KOKO (here) / SOKO (there) / ASOKO (over there)"
and a candidate referent has the semantic marker LOC (location), the candidate
referent is given 10 points.

Candidate judging rule 4

When a pronoun is "KOKO/SOKO/ASOKO", a candidate referent is given the points
in Table 5.5 according to the semantic similarity between the candidate
referent and the codes {6563006010 6559005020 9113301090 9113302010
6471001030 6314020130}, which signify locations in BGH [NLRI 64].

"SOKO (there)" commonly refers to a location. For example, "SOKO" in the
following sentences refers to "BAITEN (shop)", which signifies a location.

TAROO-GA KOUEN-DE HON-WO YONDE-IMASHITA.
(Taroo) (in the park) (book) (be reading)
(Taroo was reading a book in the park.)

KOORA-WO KAINI BAITEN-NI HAIRIMASHITA.
(cola) (buy) (shop) (enter)
(Taroo entered a shop to buy a cola.)

(5.4)

JIROO-WA SOKO-DE GUUZEN DEKUWASHIMASHITA.
(Jiroo) (there) (by chance) (meet)
(Jiroo met Taroo there by chance.)

Rule when "KOKODE" or "SOKODE" is Used as a Conjunction 

Candidate enumerating rule 3

When a pronoun is "KOKODE" or "SOKODE",
{(the pronoun is used as a conjunction, 11)}

This rule is for the case in which "KOKODE (here or then)" or "SOKODE (there or
then)" is used as a conjunction. If no word which signifies a location is found
near "KOKODE" or "SOKODE", the candidate listed by this rule has the highest
score, and "KOKODE" or "SOKODE" is judged to be a conjunction. By using this
rule, "SOKODE" in the following sentences is judged to be a conjunction.

OJIISAN-WA TENGU-GA KOWAKUNAKUNATTE-IMASHITA.
(old man) (tengu) (lose all fear of)
(The old man lost all fear of the "tengu.")

(5.5)

SOKODE OJIISAN-WA KAKURETEITA ANA-KARA DETEKIMASHITA.
(so) (old man) (be hiding) (hole) (leave)
(So, he left the hole where he had been hiding.)
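The competition between the conjunction reading and location candidates can be
sketched as follows. This is a rough illustration under several assumptions:
the base points and similarity levels of the candidates are supplied by hand
(hypothetical values), the point table is read from Table 5.5 with the blank
cell for level 2 taken as 0, the penalty for human candidates is left out, and
ties are resolved in favour of the conjunction reading.

```python
# Points of Table 5.5 (the blank cell for level 2 is read as 0).
LOCATION_POINTS = {0: -10, 1: -5, 2: 0, 3: 5, 4: 10, 5: 10, 6: 10, "exact": 10}
CONJUNCTION_POINTS = 11   # Candidate enumerating rule 3
LOC_MARKER_BONUS = 10     # Candidate judging rule 3

def read_sokode(candidates):
    """Pick the reading of "KOKODE"/"SOKODE".

    candidates: list of (name, base_points, has_loc_marker, similarity_level),
    where base_points stands for the points from the enumerating rules.
    """
    best_name, best_score = "conjunction", CONJUNCTION_POINTS
    for name, base, has_loc, level in candidates:
        score = base + (LOC_MARKER_BONUS if has_loc else 0) + LOCATION_POINTS[level]
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score

# No location nearby: the conjunction reading wins.
print(read_sokode([("OJIISAN", 17, False, 0), ("TENGU", 19, False, 0)]))
# A location nearby: the demonstrative reading wins.
print(read_sokode([("BAITEN", 18, True, "exact")]))
```

The fixed 11 points act as a threshold that non-location candidates, once
penalised, fail to reach.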

This rule is necessary when the system translates "SOKODE" into English: it
must judge whether the word is used as a demonstrative or as a conjunction, and
translate it into "there" or "then" accordingly.
Rule in the Case of Cataphora

Demonstrative pronouns can be intersententially cataphoric. In this case, we
analyze a demonstrative pronoun by using rules based on Matsuoka's method
[Matsuoka et al 95]. That work [Matsuoka et al 95] also deals with cases in
which demonstrative pronouns refer to the next sentences. But these cases
rarely happen, and the precision is higher when we do not use this rule. For
this reason we do not use this rule.

The Other Rules 

Candidate enumerating rule 4

When a pronoun is "SORE/ARE/KORE" or a demonstrative adjective and
the previous bunsetsu contains the predicative form of a verb or an
expression enumerating examples such as "TOKA (and so on),"
{(the expression, 40)}

Candidate enumerating rule 5

When a pronoun is a demonstrative pronoun, a demonstrative adverb, or a
demonstrative adjective,
{(Introduce an individual, 10)}

This rule is used when there is no referent of a pronoun in the sentences.
This rule makes the system introduce a certain individual.

5.3.2 Rule for Demonstrative Adjective 

Demonstrative adjectives such as "KONO (this)", "SONO (the)", "ANO (that)",
"KON'NA (like this)", and "SON'NA (like it)" are classified into two reference
categories: gentei-reference and daikou-reference.

^ Cataphora is the phenomenon in which an anaphor refers to elements which
appear later.

In gentei-reference, although a demonstrative adjective does not refer to an
entity by itself, the phrase "demonstrative adjective + noun phrase" refers to
the antecedent. An example is "KONO OJIISAN (this old man)" in the following
sentences:

OJIISAN-WA TENGUTACHI-NO-MAENI DETEITTE ODORI-HAJIMEMASHITA 

(old man) (before the "tengu" ) (appear) (begin to dance) 

(He appeared before the "tengu," and began to dance.) 

(5.6) 
KEREDOMO KONO OJIISAN-WA UTA-MO ODORI-MO HETAKUSO-DESHITA 

(but) (this old man) (sing) (dance) (poor) 

(But the old man was a poor singer, and his dancing was no better. ) 

In this example, although the demonstrative "KONO (this)" does not refer to 
"OJIISAN (old man)" in the first sentence, the noun phrase "KONO OJIISAN 
(this old man)" refers to "OJIISAN (old man)" in the first sentence. 

Daikou-reference is the case in which a demonstrative adjective itself refers
to an entity. In this case, "SONO (the)" can be analyzed in the same way as
"SORE-NO (of it)". In the following sentences, "SONO" refers to "TENGU". This
is a case of daikou-reference.

MATA KARASU-NO-YOUNA KAO-WO-SHITA TENGU-MO IMASHITA
(also) (like crows) (with face) ("tengu") (exist)
(There were also some "tengu" with faces like those of crows.)

(5.7)

SONO KUCHI-WA TORINO-KUCHIBASHI-NOYOUNI TOGATTEIMASHITA
(their mouths) (like the beaks of birds) (be pointed)
(Their mouths were pointed like the beaks of birds.)

Rules for gentei-reference and daikou-reference are as follows:

Rule for Gentei-Reference

Candidate enumerating rule 6

When a pronoun is "so-series demonstrative adjective + noun a,"

{(the noun phrase containing a noun a, 45)
(the topic which is a subordinate of the noun a and which has the weight W
and the distance D, W - D*2 + 10)
(the focus which is a subordinate of the noun a and which has the weight W
and the distance D, W - D*2 + 10)}
The definition and the weight (W) of topic and focus are shown in Table 5.1
and Table 5.2.

When a possible referent is a topic, the distance (D) between the estimated
noun phrase and the possible referent is the number of topics between them.
When a possible referent is a focus, the distance (D) is the number of foci
between them.

The relation between a super-ordinate word and a subordinate word is detected
as follows: the last word in the definition of the word a in the EDR Japanese
word dictionary [EDR 95a] is judged to be the super-ordinate of the word a
[Tsurumaru et al 91].

Since a so-series demonstrative refers to noun phrases nearer than a ko-series
demonstrative does, we give the coefficient 2 in the second term.

Candidate enumerating rule 7

When a pronoun is "ko-series demonstrative adjective + noun a,"

{(the noun phrase containing a noun a, 45)
(the topic which is a subordinate of the noun a and which has the weight W
and the distance D, W - D + 30)
(the focus which is a subordinate of the noun a and which has the weight W
and the distance D, W - D + 30)}

Candidate enumerating rule 8

When a pronoun is "a-series demonstrative adjective + noun a,"

{(the noun phrase containing a noun a, 45)
(the topic which is a subordinate of the noun a and which has the weight W
and the distance D, W - D*0.4 + 30)
(the focus which is a subordinate of the noun a and which has the weight W
and the distance D, W - D*0.4 + 30)}
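The three gentei-reference rules differ only in the distance coefficient and
the constant term, which the following sketch makes explicit. It is an
illustration under assumptions: the garbled terms in the rules are read as
"+ 10" for so-series and "D*0.4" for a-series, and the names SERIES and
gentei_points are hypothetical.

```python
# (distance coefficient, constant) per demonstrative series.
# ASSUMPTION: so-series constant read as 10, a-series coefficient read as 0.4.
SERIES = {
    "so": (2.0, 10),   # Candidate enumerating rule 6
    "ko": (1.0, 30),   # Candidate enumerating rule 7
    "a":  (0.4, 30),   # Candidate enumerating rule 8
}
SAME_NOUN_POINTS = 45  # the noun phrase containing the same noun

def gentei_points(series, weight, distance):
    """Points for a topic/focus that is a subordinate of the modified noun."""
    coefficient, constant = SERIES[series]
    return weight - distance * coefficient + constant

# A "Noun WA" topic (W = 20) at distance D = 2 under each series.
for series in ("so", "ko", "a"):
    print(series, gentei_points(series, 20, 2))
```

The larger coefficient for so-series makes its points fall off fastest with
distance, in line with the remark that so-series demonstratives refer to nearer
noun phrases.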

Table 5.6: Points given to so-series demonstrative adjective

Similarity Level    0     1     2     3     4     5     6    Exact Match
Point             -10    -2    -1     0     1     2     3     4

Because of the above three rules, when a pronoun is "demonstrative adjective
+ noun phrase a" and there is the same noun phrase a near it, it is judged to be
"gentei-reference" and that noun phrase is selected as a candidate for the
referent. When there is a subordinate of the noun phrase a near it, the
subordinate is also selected as a candidate for the referent. These rules give
higher points to a candidate referent than the other rules do. The following is
an example of "demonstrative adjective + noun phrase a" referring to a
subordinate of the noun phrase a.

OJIISAN-WA TOONOITEIKU TSURU-NO SUGATA-WO MIOKURIMASHITA.
(old man) (recede) (crane) (figure) (watch)
(The old man watched the receding figure of the crane.)

(5.8)

"ANO TORI-WO TASUKETE YOKATTA" TO IIMASHITA.
(that bird) (save) (glad) (say)
("I'm glad I saved that bird," said the old man to himself.)

In this example, the underlined "ANO TORI (that bird)" refers to a subordinate 
"TSURU (crane)" in the previous sentence. 



Rules for Daikou-Reference of So-Series Demonstrative Adjective

Candidate judging rule 5

When a pronoun is a so-series demonstrative adjective, the system consults
examples of the form "noun X NO noun Y" whose noun Y is modified by the
pronoun, and gives a candidate referent the points in Table 5.6 according to
the similarity between the candidate referent and noun X in "Bunrui Goi
Hyou" [NLRI 64]. The Japanese Co-occurrence Dictionary [EDR 95c] is used
as a source of examples of "X NO Y".



This rule is for checking the semantic constraint. (For a daikou-reference,
candidates for the referent are selected by Candidate enumerating rule 1 in
Section 5.3.1.)

We explain how the rule is used for the underlined "SONO (the)" in the
sentences (5.7). First, the system gathers examples of the form "Noun X NO
KUCHI (mouth of Noun X)". Table 5.7 shows some examples of "Noun X NO KUCHI
(mouth of Noun X)" in the Japanese Co-occurrence Dictionary [EDR 95c]. Next,
the system checks the semantic similarity between the candidate referents and
Noun X, and judges that a candidate referent which has a higher similarity is a
better candidate referent. In this example, "TENGU" is semantically similar to
Noun X in that they are living things. Finally, the system selects "TENGU" as
the proper referent.

Table 5.7: Examples of the form "the mouth of Noun X"

Examples of Noun X

HUKURO (sack), RUPORAITA (documentary writer), IIN (member),
AKACHAN (baby), KARE (he)

Table 5.8: Points given in the case of non-so-series demonstrative adjective

Similarity Level    0     1     2     3     4     5     6    Exact match
Point             -30   -30   -30   -30   -10    -5    -2     0
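The example-based check for a so-series demonstrative adjective can be sketched
as follows. This is a loose sketch under stated assumptions: the similarity
level is taken to be the number of leading digits shared by two thesaurus
codes (capped at 6), which may differ from the definition used in the thesis;
the best similarity over all examples is used; the codes in the usage example
are invented; and the point table is read from Table 5.6 with the blank cell
for level 3 taken as 0.

```python
# Points of Table 5.6 (the blank cell for level 3 is read as 0).
SO_SERIES_POINTS = {0: -10, 1: -2, 2: -1, 3: 0, 4: 1, 5: 2, 6: 3, "exact": 4}

def similarity_level(code_a, code_b, max_level=6):
    """ASSUMPTION: level = length of the shared leading-digit prefix."""
    if code_a == code_b:
        return "exact"
    shared = 0
    for digit_a, digit_b in zip(code_a, code_b):
        if digit_a != digit_b:
            break
        shared += 1
    return min(shared, max_level)

def daikou_points(candidate_code, example_codes):
    """Points for a candidate, from its best similarity to any example Noun X."""
    if not example_codes:
        return 0  # no examples of "X NO Y": the rule gives no evidence
    levels = [similarity_level(candidate_code, code) for code in example_codes]
    if "exact" in levels:
        return SO_SERIES_POINTS["exact"]
    return SO_SERIES_POINTS[max(levels)]

# Invented codes for two example Noun X and two candidate referents.
examples = ["5612999999", "1234567890"]
print(daikou_points("5612345678", examples))  # shares four leading digits
print(daikou_points("9000000000", examples))  # shares none
```

Because the points depend only on the best-matching example, adding more
examples of "X NO Y" can raise a candidate's points but never lower them.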



Rules when Non-So-Series Demonstrative has Daikou-Reference

Candidate judging rule 6

When a pronoun is a non-so-series demonstrative adjective, the system consults
examples of the form "Noun X NO (of) Noun Y (Y of X)" whose Noun Y is modified
by the pronoun, and gives candidate referents the points in Table 5.8 according
to the similarity between the candidate referent and Noun X in "Bunrui Goi
Hyou" [NLRI 64]. Since a non-so-series demonstrative adjective is rarely a
daikou-reference [NLRI 81] [Yamamura et al 92], the points are lower than
those in the case of so-series.
Rule when a Pronoun Refers to a Verb Phrase

As with a demonstrative pronoun, a demonstrative adjective can refer to the
meaning of the verb phrase in the previous sentence.



TSUMARI, NINGEN-NO NOU-YORI YUUSHUUNA PATAAN NINSHIKI 
PUROGURAMU-GA TSUKURENAI DANKAI-DEWA, HIJOUNI HUKUZAT- 
SUDE OMOSHIROSOUNA JISHOU-NITSUITEWA, MAZU SONO GAZOU 
WO TSUKUTTE, SONO DEETA-WO BUTSURIGAKUSHA-NI GINMI- 
SASERU HITSUYOU-GA-ARU. 

(Until scientists invent a pattern recognition program that works better than
the human brain, it will be necessary to produce images of the most complicated
and interesting events so that physicists can scrutinize the data.)
1980 NEN DAI-NO SHOTOU-NI LEP JIKKEN SOUCHI-NO SEKKEI-GA 
HAJIMATTA-TOKI, KONO SENRYAKU -GA SAIYOU SARETANODATTA. 
(This strategy was adopted by workers when they began to design the LEP 
detectors in the early 1980s.) 



The referent of "KONO SENRYAKU (this strategy)" is the meaning of the previous
sentence. The resolution in this case is performed as follows. When there are
no noun phrases near the demonstrative which are suitable as the referent of
"KONO (this)" or of "KONO SENRYAKU (this strategy)", the system judges that the
meaning of the previous sentence is the proper referent. However, as with a
demonstrative pronoun, when a verb phrase which contains a conjunctive particle
such as "GA", "DAGA", and "KEREDO", or which is in a conditional form, exists
in the same sentence, that verb phrase is judged to be the proper referent.
This procedure is carried out by Candidate enumerating rule 2 in Section 5.3.1.

^ It is necessary to distinguish between daikou-reference and gentei-reference
even in the case when a pronoun refers to a verb phrase. But, in this thesis,
we do not distinguish them because of the difficulty of the problem.
Table 5.9: The result of the investigation whether "KON'NA + noun (noun like
this)" refers to the previous sentences or the next sentences

Postpositional particle    the previous sentence    the next sentence
WA (topic)                   9                        0
WA-NAI                       5                        0
NI (indirect object)        17                        0
NI-MO                        1                        0
NI-WA                        2                        0
DE (place)                  15                        0
DE-WA                        5                        0
NO (possessive)              9                        0
SURA                         2                        0
GA (subject)                27                       22
WO (object)                 43                       26
MO (also)                    2                        4
DE-WA-NAI                    0                        1
Total                      137                       53

Rule for "KON'NA + Noun (noun like this)" 

"KON'NA Noun" can also refer to the next sentences in addition to a noun phrase 
and the previous sentences. 

OJIISAN-WA ODORINAGARA KON'NA UTA -WO UTAIMASHITA. 

(old man) (dance) (song like this) (sing) 

(As he danced, he sang the following song: ) 

(5.9) 
"TENGU TENGU HACHI TENGU. 

(tengu) (tengu) (eight tengu) 

("'Tengu,' 'tengu,' Eight 'tengu."') 

In the above example, "KON'NA UTA (song like this)" refers to the next sentence 
"TENGU, TENGU, HACHI TENGU." 

But we cannot decide whether "KON'NA + noun (noun like this)" refers to
the previous sentences or the next sentences from the expression "KON'NA
+ noun (noun like this)" itself. To make the decision, we gathered 317 sentences
containing "KON'NA (like this)" from about 60,000 sentences in TENSEIJINGO
and editorials (1986 and 1987), and counted how often "KON'NA" refers to the
previous sentences and how often to the next sentences. The result is shown in
Table 5.9. This table indicates that "KON'NA + noun" followed by a particle
other than "GA" and "WO," which are used when representing new information,
very often refers to the previous sentence. In that case, therefore, the system
judges that the desired antecedent is the previous sentence. When "KON'NA +
noun" is followed by the particle "GA" or "WO," the proper referent is
determined by the presence of quotation marks, as in Matsuoka's method
[Matsuoka et al 95].
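The claim about particles other than "GA" and "WO" can be checked with a short
calculation over the counts of Table 5.9. The counts below are read from the
table with blank cells taken as 0 (an assumption, although it reproduces the
totals 137 and 53); the variable names are ours.

```python
# particle: (refers to previous sentence, refers to next sentence), Table 5.9.
# Blank cells in the table are read as 0.
COUNTS = {
    "WA": (9, 0), "WA-NAI": (5, 0), "NI": (17, 0), "NI-MO": (1, 0),
    "NI-WA": (2, 0), "DE": (15, 0), "DE-WA": (5, 0), "NO": (9, 0),
    "SURA": (2, 0), "GA": (27, 22), "WO": (43, 26), "MO": (2, 4),
    "DE-WA-NAI": (0, 1),
}

def totals(particles):
    """Sum the (previous, next) counts over the given particles."""
    previous = sum(COUNTS[p][0] for p in particles)
    following = sum(COUNTS[p][1] for p in particles)
    return previous, following

new_info = ["GA", "WO"]
others = [p for p in COUNTS if p not in new_info]

other_prev, other_next = totals(others)
gawo_prev, gawo_next = totals(new_info)
print(other_prev, other_next, round(other_prev / (other_prev + other_next), 3))
print(gawo_prev, gawo_next, round(gawo_prev / (gawo_prev + gawo_next), 3))
```

For the other particles the previous sentence accounts for 67 of 72 cases,
whereas for "GA" and "WO" the split is 70 to 48, which is why those two
particles need the additional test.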



5.3.3 Rule for Demonstrative Adverb 

Rule when So-Series Demonstrative Adverb Refers to
the Previous Sentences

Candidate enumerating rule 9

When an anaphor is a so-series demonstrative adverb such as "SOU (so),"
{(the previous sentences, 30)}

The example is as follows.

"TENGU TENGU HACHI TENGU. 

(tengu) (tengu) (eight tengu) 

("'Tengu,' 'tengu,' Eight 'tengu."') 

(5.10) 
SOU UTATTA-NOWA SOKONI HACHIHIKI-NO TENGU-GA ITAKARA-DESU. 

(sing so) (there) (eight) (tengu) (exist) 

(He sang so because he counted eight of them there. ) 

"SOU (so)" refers to the previous sentence "TENGU TENGU HACHI TENGU". 

Rule when So-Series Demonstrative Adverb Cataphorically
Refers to the Verb Phrase in the Same Sentence

Candidate enumerating rule 10

When an anaphor is "SOU/SOUSHITE/SONOYOUNI" and is in a subordinate clause
which has a conjunctive particle such as "GA", "DAGA", and "KEREDO" or an
adjective conjunction such as "YOUNI",
{(the main clause, 45)}

This rule is based on Matsuoka's method [Matsuoka et al 95].



Rule when Ko-Series Demonstrative Adverb Refers to
the Previous Sentences

Candidate enumerating rule 11

When an anaphor is a ko-series demonstrative adverb such as "KOU (in this
way)",
{(the previous sentences, 25)}

Rule when Ko-Series Demonstrative Adverb Refers to the Next
Sentences

Candidate enumerating rule 12

When an anaphor is a ko-series demonstrative adverb,
{(the next sentences, 26)}

A ko-series demonstrative adverb can also refer to the next sentences in
addition to the previous sentences.

TENGU-TACHI-WA TOUTOU KOU IIMASHITA. 

(tengu) (finally) (like this) (say) 

(The "tengu" finally said as follows :) 

(5.11) 
KYOU-NO OMAE-WA DAME-DANA. ... 

(today) (you) (no good) 

("You're no good today. ...") 

In the example, "KOU (in this way)" refers to the next sentences. When "KOU
(in this way)" is part of a typical form such as "KOU SHITE" and "KOU
SUREBA," it often refers to the previous sentences. Therefore, if "KOU (in this
way)" is part of such a typical form, the system judges that the desired antecedent
is the previous sentence. Otherwise, the system judges that the desired antecedent
is the next sentence. To implement this procedure, we made the following rule.

Candidate enumerating rule13

When an anaphor is a part of "KOU/KON'NAHUUNI" + conditional form
or "KOU SHITE" and is not the last word in the sentence,
{(the previous sentence, 7)}
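
The interaction of rules 11 to 13 can be sketched as follows. This is a minimal illustration, not the system's implementation: it assumes that scores from different rules are summed per candidate (as the totals in Figure 5.4 suggest), and the function and variable names are hypothetical.

```python
# Scores taken from Candidate enumerating rules 11, 12, and 13 above.
RULE11_PREVIOUS = 25  # ko-series adverb -> the previous sentences
RULE12_NEXT = 26      # ko-series adverb -> the next sentences
RULE13_PREVIOUS = 7   # typical form such as "KOU SHITE", not sentence-final


def score_kou(in_typical_form, is_last_word):
    """Sum the scores that rules 11-13 give to the two sentence candidates."""
    scores = {"previous": RULE11_PREVIOUS, "next": RULE12_NEXT}
    # Rule 13 fires only for the typical form when it is not the last word.
    if in_typical_form and not is_last_word:
        scores["previous"] += RULE13_PREVIOUS
    return scores


def antecedent_direction(in_typical_form, is_last_word):
    """Return the candidate with the highest total score."""
    scores = score_kou(in_typical_form, is_last_word)
    return max(scores, key=scores.get)
```

Under this reading, a plain "KOU" prefers the next sentences (26 against 25), while "KOU SHITE" prefers the previous sentence (32 against 26).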

5.4 Heuristic Rule for Personal Pronoun 

Candidate enumerating rule1

When an anaphor is a first personal pronoun, 
{(the first person (the speaker) in the context, 25)} 

Candidate enumerating rule2 

When an anaphor is a second personal pronoun, 
{(the second person (the hearer) in the context, 25)} 

A first or second personal pronoun often appears in a quotation, and can be
resolved by estimating the first person (speaker) and the second person (hearer) in
advance. The first person and the second person are estimated by regarding
the ga-case component and the ni-case component of the verb phrase which
represents the speaking action of the quotation as the first person and the second
person, respectively. The verb phrase representing the speaking action is
detected as follows. If the quotation is followed by a speaking action
verb phrase such as "TO ITTA (was said)," that verb phrase is regarded as the
verb phrase representing the speaking action. Otherwise, the last verb phrase in
the previous sentence is regarded as the verb phrase representing the speaking
action *. For example, the second personal pronoun "OMAESAN (you)" in the
following sentences refers to the second person "OJIISAN (the old man)" in this
quotation.

"ASU, MATA MAIRIMASUYO." TO,
(tomorrow) (again) (come)
("I'll come again tomorrow,")

OJIISAN-WA YAKUSOKU-SHIMASHITA.
(old man) (promise)
(promised the old man.)

(5.12)
"MOCHIRON OMAESAN-WO UTAGAUWAKEDEWANAINODAGA,"
(of course) (you) (don't mean to doubt)
("Of course, we don't mean to doubt you,")

TENGU-GA OJIISAN-NI IIMASHITA.
(tengu) (old man) (said)
(said one of the "tengu" to the old man.)

* This method can make errors in detecting the verb phrase representing the speaking
action. But in the sample texts used in the experiments of this thesis, all detections
were performed properly by this method.

That the second person in the quotation is "OJIISAN" is estimated from the
fact that the ni-case component of the verb phrase "IIMASHITA (said)", which
represents the speaking action of the quotation, is "OJIISAN".
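
The speaker and hearer estimation described above can be sketched as follows. This is only an illustration under stated assumptions: the `VerbPhrase` record and both function names are hypothetical, and the case components are assumed to be supplied by the case structure analysis.

```python
from typing import NamedTuple, Optional


class VerbPhrase(NamedTuple):
    """A verb phrase with its ga-case and ni-case components (hypothetical record)."""
    verb: str
    ga: Optional[str]  # ga-case component
    ni: Optional[str]  # ni-case component


def estimate_persons(following_vp, previous_vp):
    """Return (first person, second person) for a quotation.

    following_vp: speaking-action verb phrase that follows the quotation, or None.
    previous_vp:  last verb phrase of the previous sentence, used as the fallback.
    """
    vp = following_vp if following_vp is not None else previous_vp
    if vp is None:
        return None, None
    # ga-case is taken as the speaker, ni-case as the hearer.
    return vp.ga, vp.ni


def resolve_personal_pronoun(person, first, second):
    """Rules 1 and 2: map a first/second personal pronoun to speaker/hearer."""
    if person == 1:
        return first
    if person == 2:
        return second
    return None
```

For example (5.12), the verb phrase "IIMASHITA" with ga-case "TENGU" and ni-case "OJIISAN" yields "OJIISAN" as the referent of "OMAESAN (you)".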

Candidate enumerating rule3

When an anaphor is a third personal pronoun,
{(a first person, -10) (a second person, -10)}

Personal pronouns are generally analyzed by the following three rules: the
system lists candidate referents with scores (the certification value) considering
topic/focus and the distance between the anaphor and the candidate referents by
Candidate enumerating rule4, and increases the score of the candidate referents
which signify human beings by Candidate judging rule1 and Candidate judging
rule2.

Candidate enumerating rule4

When an anaphor is a personal pronoun,

{(A topic which has the weight W and the distance D, W - D - 2)
(A focus which has the weight W and the distance D, W - D + 4)}

Table 5.10: Points given in the case of personal pronoun

Similarity Level    0     1     2     3     4     5     6     Exact Match
Point                           3     7     10    10    10    10

Candidate judging rule1

When an anaphor is a personal pronoun and a candidate referent has a
semantic marker HUM, the candidate referent is given 10 points.

Candidate judging rule2

When an anaphor is a personal pronoun, a candidate referent is given the
points in Table 5.10 by using the highest semantic similarity between the
candidate referent and the codes {5200003010 5201002060 5202001020 5202006115
5241002150 5244002100} which signify human beings in BGH [NLRI 64].



5.5 Heuristic Rule for Zero Pronoun 



Rule Proposing Candidate Referents of General Zero Pronoun

Candidate enumerating rule1

When a zero pronoun is a ga-case component,
{(A topic which has the weight W and the distance D, W - D * 2 + 1)
(A focus which has the weight W and the distance D, W - D + 1)
(A subject of a clause coordinately connected to the clause containing the
anaphor, 25)
(A subject of a clause subordinately connected to the clause containing the
anaphor, 23)
(A subject of a main clause whose embedded clause contains the anaphor,
22)}

Candidate enumerating rule2

When a zero pronoun is not a ga-case component,
{(A topic which has the weight W and the distance D, W - D * 2 - 3)
(A focus which has the weight W and the distance D, W - D * 2 + 1)}
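
The topic and focus scores of rules 1 and 2 can be written out as a small function. This is a sketch under stated assumptions: the expressions are read with ordinary operator precedence (so "W - D * 2 + 1" means W minus twice D plus 1), the function name is hypothetical, and the weights used when calling it are illustrative values rather than weights defined in this thesis.

```python
def zero_pronoun_score(role, weight, distance, is_ga_case):
    """Score a topic or focus candidate for a zero pronoun (rules 1 and 2).

    role:       "topic" or "focus"
    weight:     the weight W of the topic/focus
    distance:   the distance D between the anaphor and the candidate
    is_ga_case: True when the zero pronoun is a ga-case component
    """
    if role not in ("topic", "focus"):
        raise ValueError("role must be 'topic' or 'focus'")
    if is_ga_case:
        # Candidate enumerating rule1
        if role == "topic":
            return weight - distance * 2 + 1
        return weight - distance + 1
    # Candidate enumerating rule2
    if role == "topic":
        return weight - distance * 2 - 3
    return weight - distance * 2 + 1
```

With the same weight and distance, a topic scores 4 points higher for a ga-case zero pronoun than for another case, so topics are favoured more strongly as subjects.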

Rule for Analyzing Complex Sentences 

Candidate enumerating rule3

When a zero pronoun is the ga-case of the main (or subordinate) clause in a
complex sentence, the complex sentence is connected by a conjunctive particle
indicating disagreement of the subjects in a complex sentence, such as
"NODE (because)" and "NARABA (if)", and the subject of the subordinate
(or main) clause is not omitted and is followed by the particle "GA,"
{(the subject of the subordinate (or main) clause, -30)}

For a ga-case zero pronoun of the main (or subordinate) clause in a complex
sentence, if there is a ga-case noun phrase in the subordinate (or main) clause, the
system normally judges that the ga-case noun phrase is the antecedent of the
ga-case zero pronoun. But it is known that there are conjunctive particles which
produce disagreement of subjects in a complex sentence [Minami 74] [Yoshimoto 86]
[Hirai BC] [Nakaiwa & Ikehara 95]. When a complex sentence is connected by
these conjunctive particles, the system does not judge that the noun phrase of the
subordinate (or main) clause is the desired antecedent. Candidate enumerating
rule3 is for this procedure.

Rule Using Semantic Relation to Verb Phrase 

Candidate judging rule1

When a candidate referent of a case component (a zero pronoun) does not
satisfy the semantic marker of the case component in the case frame, it is
given -5.

Candidate judging rule2

A candidate referent of a case component (a zero pronoun) is given the points
in Table 5.11 by using the highest semantic similarity between the candidate
referent and the examples of the case component in the case frame.

Table 5.11: Points given from a verb-noun relationship

Similarity Level    0     1     2     3     4     5     6     Exact Match
Point               -10   -2    1     2     2.5   3     3.5   4

OJIISAN-WA JIMEN-NI KOSHI-WO-OROSHIMASHITA.
(old man) (ground) (sit down)
(The old man sat down on the ground.)

YAGATE (OJIISAN-WA) NEMUTTE-SHIMAIMASHITA.
(soon) (old man) (fall asleep)
(He soon fell asleep.)

Semantic Marker    HUM/ANI                GA (agent)    NEMURU (sleep)
Example            KARE (he)/INU (dog)    GA (agent)    NEMURU (sleep)

Figure 5.3: Example of how to check semantic constraint



These two rules check the semantic constraint between the candidate
referent and the verb phrase which has the candidate referent in its case
component. Candidate judging rule1 checks semantic constraints by using semantic
markers. Candidate judging rule2 checks semantic constraints by using examples.
We explain how to check semantic constraints with the example sentences in Figure
5.3.

In the method using semantic markers, a candidate referent is the proper
referent if one of the semantic markers which the candidate referent has is equal
or subordinate to the semantic marker of the case component. For example, with
respect to the zero pronoun in Figure 5.3, since the ga-case component of the
verb "NEMURU (sleep)" has the semantic markers HUM (human being) and ANI
(animal) *, and "OJIISAN (old man)" has the semantic marker HUM, "OJIISAN"
is judged to be the proper referent.

In the example-based method, the validity of a candidate referent is decided
by the semantic similarity between the candidate referent and the examples of the
case component in the verb case frame. The higher the semantic similarity is, the
higher the validity is. For example, with respect to the zero pronoun in Figure 5.3,
since the examples of the ga-case are "KARE (he)" and "INU (dog)" and "OJIISAN
(old man)" is semantically similar to "KARE (he)", "OJIISAN (old man)" is the
proper referent.
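
The two semantic checks can be sketched side by side. This is a minimal illustration, not the system's code: the marker hierarchy passed as `parent` and the markers "MAN" and "LOC" in the usage are hypothetical, and the similarity levels are assumed to have been computed from BGH beforehand (how they are computed is not shown in this section). The points are those of Table 5.11 and Candidate judging rule1.

```python
# Points from Table 5.11: index 0-6 is the similarity level, index 7 an exact match.
TABLE_5_11 = [-10, -2, 1, 2, 2.5, 3, 3.5, 4]
EXACT_MATCH = 7


def satisfies_marker(candidate_markers, frame_markers, parent=None):
    """True if a candidate marker equals, or is subordinate to, a frame marker.

    parent maps a marker to its superordinate marker (hypothetical hierarchy).
    """
    parent = parent or {}
    for marker in candidate_markers:
        current, seen = marker, set()
        # Walk up the hierarchy until a frame marker or the root is reached.
        while current is not None and current not in seen:
            if current in frame_markers:
                return True
            seen.add(current)
            current = parent.get(current)
    return False


def marker_points(candidate_markers, frame_markers, parent=None):
    """Candidate judging rule1: -5 when the marker constraint is not satisfied."""
    return 0 if satisfies_marker(candidate_markers, frame_markers, parent) else -5


def example_points(similarity_levels):
    """Candidate judging rule2: points for the highest similarity to any example."""
    return TABLE_5_11[max(similarity_levels)]
```

The marker check is a hard pass/fail test, whereas the example check grades candidates, which is why a candidate only loosely similar to the examples is penalized lightly rather than rejected.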

These rules, which use semantic relations to verbs, are also used in the esti- 
mation of the referent of demonstratives and personal pronouns. 

Rule Using the Feature that it is Difficult for a Noun Phrase to 
be Filled in Plural Case Components of the Same Verb 

Candidate enumerating rule4

When there is "Noun X" in another case component of the verb which has
the analyzed case component (the analyzed zero pronoun), {(Noun X, -20)}

Rule Using Empathy 

Candidate enumerating rule5

When an anaphor is a ga-case zero pronoun whose verb is followed by
auxiliary verbs such as "KURERU" and "KUDASARU" and there is a
ni-case zero pronoun in the verb, the ni-case zero pronoun is analyzed first.
With respect to the ga-case zero pronoun, {(do not fill a zero pronoun, -5)}
This rule is based on empathy theory [Kameyama 86].

When an anaphor is a ga-case zero pronoun whose verb is followed by
auxiliary verbs such as "KURERU" and "KUDASARU," the ni-case zero pronoun
is analyzed first and is filled with a noun phrase which has high empathy,
such as a topic, and the ga-case zero pronoun is filled with another noun phrase.

* HUM and ANI are the semantic markers which indicate human being (HUMAN) and animal
(ANIMAL), respectively.

Rule for Zero Pronoun in the Quotation

Candidate enumerating rule6

In the quotation, when an anaphor is a ga-case zero pronoun which is easily
filled with a first person, whose verb is one such as "YARU (give)", "SHITAI
(want)", and "IKU (go)," {(the first person, 5)}

Candidate enumerating rule7

In the quotation, when an anaphor is a ga-case zero pronoun which is
easily filled with a second person, whose verb is one such as "KURERU (give)",
"NASARU (do)", and "KURU (come)", or whose verb is in an imperative or
interrogative form, {(the first person, -30) (the second person, 25)}

Candidate enumerating rule8

In the quotation, when an anaphor is a ga-case zero pronoun,
{(the first person, 15)}



A zero pronoun in a quotation can often be resolved by the surface expression
of the last words in the sentence. A zero pronoun can be resolved by estimating
the first person (speaker) or the second person (hearer), as in the case of a
personal pronoun *. For example, in the next quotation, we find that the first
person is "TENGU TACHI (tengu)" and that the second person is "OJIISAN (old man)"
by checking the ga-case component and the ni-case component of the verb "IU (say)."

TENGU-TACHI-WA TOUTOU KOU IIMASHITA.
(tengu) (finally) (like this) (say)
(The "tengu" finally said:)

"KYOU-NO OMAE-WA DAME-DANA.
(today) (you) (no good)
("You're no good today.)

(5.13)
KORE-WO [TENGU-TACHI-GA] [OJIISAN-NI] KAESHITE-YARU-KARA
(this) (tengu) (old man) (give back to)
("[We]'ll give this back [to you].)

[OJIISAN-GA] KAETTE-SHIMAE.
(old man) (go home)
([You should] Now go home.")

* [Kudou & Tomokiyo 93] estimates the person of a zero pronoun in a conversational
corpus. But in this work, quotations in a novel are dealt with, and it is necessary
to estimate the speaker and the hearer of the quotation.

The referent of the ga-case zero pronoun of the verb "KAESHITE YARU" is
the first person "TENGU TACHI ('tengu')" because "KAESHITE YARU"
contains "YARU." The referent of the ni-case zero pronoun of the verb "KAESHITE
YARU" is the second person "OJIISAN (old man)" because "KAESHITE YARU"
contains "YARU." The referent of the ga-case zero pronoun of the verb "KAETTE
SHIMAE" is the second person "OJIISAN (old man)" because "KAETTE SHIMAE"
is in the imperative form.
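
How rules 6 to 8 combine for a ga-case zero pronoun in a quotation can be sketched as follows. This is an illustration under stated assumptions: scores from the rules are summed per candidate, only the first and second person candidates are shown (the full system also scores topics, foci, and other candidates), and the auxiliary or main verb is assumed to be identified by an earlier analysis step. The function name is hypothetical.

```python
# Verbs listed in Candidate enumerating rules 6 and 7.
FIRST_PERSON_VERBS = {"YARU", "SHITAI", "IKU"}
SECOND_PERSON_VERBS = {"KURERU", "NASARU", "KURU"}


def quotation_ga_scores(verb, imperative_or_interrogative=False):
    """Sum the points rules 6-8 give to the first and second person."""
    scores = {"first": 0, "second": 0}
    scores["first"] += 15                      # rule 8: default inside a quotation
    if verb in FIRST_PERSON_VERBS:
        scores["first"] += 5                   # rule 6
    if verb in SECOND_PERSON_VERBS or imperative_or_interrogative:
        scores["first"] -= 30                  # rule 7
        scores["second"] += 25
    return scores
```

For "KAESHITE YARU" this gives the first person 20 points, and for the imperative "KAETTE SHIMAE" it gives the second person 25 points against -15 for the first person, which matches the referents chosen in example (5.13).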

The Other Rules 

Candidate enumerating rule9

When an anaphor is a ga-case zero pronoun of "Y DA (is Y)" in the expression 
of "X WO Y DA TO MINASU (consider X as Y)", {(Noun X, 50)} 

5.6 Experiment and Discussion 

5.6.1 Experiment 

Before pronoun resolution, sentences were transformed into a case structure by the
case structure analyzer [Kurohashi & Nagao 94], as in the experiments of the other
chapters. The errors made by the structure analyzer were corrected by hand. We
used the IPAL dictionary [IPAL 87] as a verb case frame dictionary. We put together
the case frames of the verb phrases which were not contained in this dictionary
by consulting a large amount of linguistic data.

DORU SOUBA-WA KITAI-KARA 130-YEN-DAI-NI JOUSHOUSHITA.
(dollar) (the expectations) (130 yen) (surge)
(The dollar has since rebounded to about 130 yen because of the expectations.)

KONO DORU-DAKA-WA OUSHUU-TONO KANKEI-WO GIKUSHAKU-SASETEIRU.
(the dollar's surge) (Europe) (relation) (strain)
(The dollar's surge is straining the relations with Europe.)

                               The score of each candidate (points)
Rule                           the previous  new         130 YEN    KITAI           DORUSOUBA
                               sentence      individual  (130 yen)  (expectations)  (dollar)
Candidate enumerating rule2    15
Candidate enumerating rule5                  10
Candidate enumerating rule1                              17         15              15
Candidate judging rule6                                  -30        -30             -30
Total Score                    15            10          -13        -15             -15

Figure 5.4: Example of resolving demonstrative "KONO (this)"

Table 5.12: Result

text        demonstrative   personal pronoun   zero pronoun     total score
Training    87% (41/47)     100% (9/9)         86% (177/205)    87% (227/261)
Test        86% (42/49)     82% (9/11)         76% (159/208)    78% (210/268)

The points given in each rule were manually adjusted by using the training sentences.
Training sentences {example sentences (43 sentences), a folk tale "KOBUTORI
JIISAN" [Nakao 85] (93 sentences), an essay in "TENSEIJINGO" (26 sentences), an
editorial (26 sentences), an article in "Scientific American (in Japanese)" (16
sentences)}
Test sentences {a folk tale "TSURU NO ONGAESHI" [Nakao 85] (91 sentences), two
essays in "TENSEIJINGO" (50 sentences), an editorial (30 sentences), articles in
"Scientific American (in Japanese)" (13 sentences)}

Table 5.13: The detailed result of demonstrative

text        demonstrative   demonstrative   demonstrative   total score
            pronoun         adjective       adverb
Training    83% (15/18)     86% (19/22)     100% (7/7)      87% (41/47)
Test        82% (14/17)     88% (23/26)     83% (5/6)       86% (42/49)

An example of resolution of the demonstrative "KONO (this)" is shown in
Figure 5.4. Figure 5.4 shows that the referent of the noun phrase "KONO DORU-DAKA
(this dollar's surge)" was properly judged to be the previous sentence.
By Candidate enumerating rule2 in Section 5.3, the system took the candidate
"The previous sentence" and gave it 15 points. By Candidate enumerating rule5 in
Section 5.3, the system took the candidate "New individual" and gave it 10 points.
By Candidate enumerating rule1 in Section 5.3, the system took three candidates,
"130 YEN (130 yen)", "KITAI (expectations)", and "DORUSOUBA (dollar)",
and gave them 17, 15, and 15 points, respectively. The system applied Candidate
judging rule6 to them. Candidate judging rule6 uses examples of "X NO Y". In
this case, Candidate judging rule6 used examples of "X NO DORUDAKA (the
dollar's surge of X)". The only noun phrase X of the form "X NO DORUDAKA" in the
EDR occurrence dictionary was "SAIKIN (recently)". All three candidates,
"130 YEN (130 yen)", "KITAI (expectations)", and "DORUSOUBA (dollar)",
were low in similarity to "SAIKIN (recently)" in "Bun Rui Goihyou", and were
given -30 points by Table 5.8. The two candidates "The previous sentence" and
"New individual" are not noun phrases, and were not given points by Candidate
judging rule6. As a result, "the previous sentence" had the highest score and was
judged to be the proper referent.
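
The score bookkeeping of Figure 5.4 can be reproduced with a few lines. This is only a sketch of the summation and selection step, using the points quoted in the paragraph above; the data layout and the function name are hypothetical.

```python
from collections import defaultdict

# Points per rule and candidate, as quoted for the "KONO DORU-DAKA" example.
RULE_SCORES = [
    ("Candidate enumerating rule2", {"the previous sentence": 15}),
    ("Candidate enumerating rule5", {"new individual": 10}),
    ("Candidate enumerating rule1", {"130 YEN": 17, "KITAI": 15, "DORUSOUBA": 15}),
    ("Candidate judging rule6", {"130 YEN": -30, "KITAI": -30, "DORUSOUBA": -30}),
]


def total_scores(rule_scores):
    """Sum the points every rule gives to each candidate referent."""
    totals = defaultdict(int)
    for _rule_name, points in rule_scores:
        for candidate, value in points.items():
            totals[candidate] += value
    return dict(totals)


totals = total_scores(RULE_SCORES)
# The candidate with the highest total is taken as the referent.
referent = max(totals, key=totals.get)
```

The totals are 15, 10, -13, -15, and -15, so "the previous sentence" is selected, as in the figure.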

We show the results of our resolution of demonstratives, personal pronouns,
and zero pronouns in Table 5.12. The detailed results for demonstratives are shown in
Table 5.13. When a demonstrative refers to some sentences, the result is regarded
as correct if the demonstrative is correctly judged to be anaphoric or cataphoric,
even if the scope of the referent cannot be estimated. This is because we think that
the estimation of the scope of the referent should be analyzed after the analysis
of the relation of the sentences such as cause-effect and exemplification. The
precision rate for zero pronouns is for the case in which the system knows in advance
whether the zero pronoun has a referent or not.

5.6.2 Discussion 

With respect to demonstratives, the precision rate was over 80% even on the test
sentences. This indicates that the rules used in this system are effective. But since
Japanese demonstratives are classified into many kinds, the precision may increase
if more detailed rules are made. In this work we used the feature that "KONO
(this)" rarely functions as a daikou-reference. Four cases were analyzed
correctly because of this rule.

With respect to personal pronouns, since only first personal pronouns and
second personal pronouns appeared in the texts used in the experiment, almost all
of the personal pronouns were resolved correctly by estimating the first persons
and the second persons in the quotation. The main reason for the errors in
personal pronoun resolution is that a ni-case zero pronoun was resolved incorrectly
and the second person was estimated incorrectly.

The errors in zero pronoun resolution arise because there are errors
in "Bunrui goi hyou", the Noun Semantic Marker Dictionary, and the Case Frame
Dictionary, and because the rules are insufficient, although they can be improved
by making new rules using syntactic structures and auxiliary expressions.

An example of an error whose correction requires understanding and reasoning is as follows:

SONNA JOUKYOU NANONI, WASHINTON-DE HIRAKARERU SHUYOU-SENSHIN- 
7-KAKOKU-NO ZOUSHOU CHUUOU GINKOU SOUSAI KAIGI (G7) NI TSUITE 
KAKKOKU-NO TSUUKA TOUKYOKU-WA "OOKINA MONDAI-WA NAI-NODE 
KYOUDOU KOMINYUKE-WA DANAI. KAOAWASE CHUUSHIN-NO KAIGOU- 
DA"-TO, MARUDE KAIGI-NO IGI-WO USUMEYOU-TO-SHITEIRUYOUNA 
IIKATA-DA. 

(Despite these problems that plague the global economy, the monetary
authorities of the Group of Seven nations seem to be trying to downplay the
upcoming G-7 meeting in Washington. The participants regard the meeting
as just a "get-acquainted session" and have decided against issuing a joint
communique.)

(...) 

(omission) 

(...) 

BEI-SHINSEIKEN-WA CHIKAKU, ZAISEI AKAJI SAKUGEN-NO GUTAITEKI- 
KOKUSOU-WO GIKAI-NI SHIMESU-YOTEI-DEARU. 

(The administration will shortly indicate its specific deficit-cutting plans to 
Congress. ) 

[TSUUKA TOUKYOKU GA] KYOUDOU KOMINYUKE-NO HAPPYOU-WO 
HIKAERUNOWA, KAWASE SHIJHO-NI KADAINA KITAI-WO ATAETAKU- 
NAI-TAME-DAROU. 

(The reason for [the monetary authorities'] doing away with a joint commu- 
nique this time seems to be to avoid arousing any false hopes in the foreign 
exchange market. ) 

The ga-case of "HIKAERU (do away with)" in this example refers to "KAKKOKU
NO TSUUKA TOUKYOKU (the monetary authorities)". But the system incorrectly
judged that the referent was "BEI-SHINSEIKEN (administration)". To obtain the
correct result, it is necessary to understand that it is the monetary authorities
who do away with a joint communique.

5.6.3 Comparison Experiment 

As we mentioned before, we use both the example rule and the semantic marker
rule as judging rules. To check which rule is more effective, we compared
the example method with the semantic marker method. The result
is shown in Table 5.14. The upper and lower rows for each kind of anaphor in this
table show the accuracy rates for training sentences and test sentences, respectively.
The rules using examples are Candidate judging rule2,4 for demonstratives,
Candidate judging rule2 for personal pronouns, and Candidate judging rule2 for zero
pronouns. The rules using semantic markers are Candidate judging rule1,3 for
demonstratives, Candidate judging rule1 for personal pronouns, and Candidate
judging rule1 for zero pronouns. We used the example rules of "X NO (of) Y (Y of X)"
in all of these comparison experiments, because there are no rules using semantic
markers which correspond to the rules of "X NO (of) Y". The precision of the method
using examples was equivalent or superior to the precision of the method using
semantic markers, as shown in Table 5.14. This indicates that we can use examples
as well as semantic markers. Since some codes in BGH are incorrect, we modified the
codes. Since the precision using modified codes was higher than that using original
codes, this indicates that the modification of the codes is valid.

Table 5.14: Result of comparison between semantic marker and example-base

                    Method 1        Method 2        Method 3        Method 4        Method 5
Demonstrative       87% (41/47)     83% (39/47)     87% (41/47)     83% (39/47)     79% (37/47)
                    86% (42/49)     88% (43/49)     88% (43/49)     84% (41/49)     86% (42/49)
Personal pronoun    100% (9/9)      100% (9/9)      100% (9/9)      100% (9/9)      89% (8/9)
                    82% (9/11)      64% (7/11)      82% (9/11)      55% (6/11)      64% (7/11)
Zero pronoun        86% (177/205)   83% (171/205)   86% (176/205)   82% (169/205)   66% (135/205)
                    76% (159/208)   76% (158/208)   79% (164/208)   75% (155/208)   63% (131/208)

Method 1    Using both Semantic Marker and Example
Method 2    Using Semantic Marker
Method 3    Using Example (using modified codes of BUNRUI GOI HYOU)
Method 4    Using Example (using original codes of BUNRUI GOI HYOU)
Method 5    Using neither Semantic Marker nor Example

There were some cases in which the example method was still effective for
expressions somewhat semantically distant from those written in a case frame. For
example, since the ni-case in the case frame of "IU (say)" is given only the semantic
marker HUM (human), the system cannot fill "TSURU (crane bird)" in the ni-case of the
following example sentences by the semantic marker method.

OJIISAN-WA TSURU-WO NIGASHI-NAGARA [TSURU-NI] IIMASHITA. 

(old man) (crane) (let loose) (to crane) (say) (5.14)

(The old man let the crane loose, and said [to crane]. ) 

But by the example method the system can fill "TSURU (crane bird)" in the
ni-case because the similarity level between human beings and animals is 1 and
the penalty to the score is small (-2 points in Table 5.11).

5.6.4 Examining Which Rules are Important 

We used many rules in this work. We examined the importance of various rules.

In zero pronoun resolution, information on the semantic relation between
verbs and case components is important because there are few key surface
expressions.

In contrast, in demonstrative resolution, information on the semantic
relation between verbs and case components is not so important because there
are many surface expressions and the referents are limited to things which are not
human. In demonstrative resolution, all the rules are important, because Japanese
demonstratives are classified into many kinds and we must make many detailed
rules.

In first and second personal pronoun resolution, the rules using first persons
and second persons were very effective.

5.7 Summary 

In this chapter, we presented a method of estimating referents of demonstrative
pronouns, personal pronouns, and zero pronouns in Japanese sentences using
examples, surface expressions, topics and foci. In conventional work, semantic
markers have been used for semantic constraints. In contrast, we used examples
for semantic constraints and showed in our experiments that examples are as
useful as semantic markers. We also proposed many new methods for estimating
referents of pronouns. For example, we use the form "X of Y" for estimating
referents of demonstrative adjectives. In addition to our new methods, we used many
conventional methods. As a result, experiments using these methods obtained a
precision rate of 87% in the estimation of referents of demonstrative pronouns,
personal pronouns, and zero pronouns on training sentences, and a precision
rate of 78% on test sentences.



Chapter 6 



Verb Phrase Ellipsis Resolution 



6.1 Introduction 

In the previous chapters, we have discussed anaphora resolution in Japanese noun
phrases and pronouns. The remaining problem is anaphora resolution in Japanese
verb phrases. Verb phrase anaphora is classified into two categories: (i) anaphora
in pro-verbs such as "SOU SURU (do so)" and (ii) the ellipsis of a verb phrase. In
this thesis, (i) anaphora by pro-verbs has already been handled in Chapter 5 as
demonstrative adverbs such as "SOU (so)" and "KOU (like this)". This chapter describes
(ii) how to resolve the verb phrase ellipsis.

Verb phrases are sometimes omitted in Japanese sentences. It is necessary
to resolve verb phrase ellipses for purposes of language understanding, machine
translation, and dialogue processing. This chapter describes a practical method
to resolve omitted verb phrases by using surface expressions and examples. In
short, (1) when the referent of a verb phrase ellipsis appears in the sentences, we
use surface expressions (clue words); (2) when the referent does not appear in
the sentences, we use examples (linguistic data). We define the verb phrase to
which a verb phrase ellipsis refers as the complemented verb phrase. For example,
"[KOWASHITA] * (broke)" in the second sentence of the following example is a
verb phrase ellipsis. "KOWASHITA (broke)" in the first sentence is a
complemented verb phrase.

KARE-WA IRONNA MONO-WO KOWASHITA.
(he) (several things) (broke)
(He broke several things.)

(6.1)
KORE-MO ARE-MO [KOWASHITA].
(this) (that) (broke)
([He broke] this and that.)

* A phrase in brackets "[ ]" represents an omitted verb phrase.

The matching part             The latter part

KON'NANI UMAKU IKUTOWA        OMOENAI.
(like this) (it succeeds)     (I don't think)
(I don't think that it succeeded like this)

ITUMO UMAKU IKUTOWA           KAGIRANAI.
(every time) (it succeeds)    (cannot expect to)
(You cannot expect to succeed every time.)

KANZENNI UMAKU IKUTOWA        IENAI.
(completely) (it succeeds)    (it cannot be said)
(It cannot be said that it succeeds completely)

Figure 6.1: Sentences containing "UMAKU IKUTOWA (it succeeds)" in a corpus
(examples)

(1) When a complemented verb phrase exists in the sentences, we use surface
expressions (clue words). This is because an elliptical sentence in case (1)
follows one of several typical patterns and has some clue words. For example, when
the end of an elliptical sentence is the clue word "MO (also)", the system judges
that the sentence is a repetition of the previous sentence and that the complemented
verb phrase is the verb phrase of the previous sentence.

In the sentences
    In the same sentence
        Inverted sentence         DARE-DESU-KA KITANOWA.
                                  (Who was the person that came here?)
    In the previous sentence
        Question-Answer           NANI-WO KOWASHITANO. KORE-WO.
                                  (What did you break? [I broke] this)
        Relation                  AKARUINE. DENKI-WO-TSUKETA-KARA.
                                  (Bright. Because I switched the light on.)
        Supplement                NAKUSHIMONO-WO SHITA. KAGI-WO.
                                  (I lost things. [I lost] keys.)
Outside the sentences
        Interrogative sentence    NAMAE-WA [NANIDESUKA].
                                  ([What is] your name?)
        da-ellipsis               WATASHI-WA GAKUSEI [DESU].
                                  (I [am] a student.)
        suru-ellipsis             WATASHI-WO KODOMO-ATSUKAI [SURU].
                                  (He [does] treat me as child.)
        Other ellipses            SOU UMAKUIKU-TOWA [OMOENAI].
        (use of common sense)     ([I don't think] it succeed so well.)

Figure 6.2: Categories of verb phrase ellipsis

(2) When a complemented verb phrase does not appear in the sentences, we
use examples. The reason is that omitted verb phrases in case (2) are diverse
and we use examples to construct the omitted verb phrases. The following is an
example of a complemented verb phrase that does not appear in the sentences.

SOU UMAKU IKUTOWA [OMOENAI].

(so) (succeed so well) (I don't think) (6.2)

([I don't think] it succeeds so well.)

When we want to resolve the verb phrase ellipsis in the sentence "SOU UMAKU
IKUTOWA [OMOENAI]", the system gathers sentences containing the expression
"SOU UMAKU IKUTOWA (it succeeds so well)" from a corpus, as shown in
Figure 6.1, and judges that the latter part of the obtained sentences (in this case,
"OMOENAI (I don't think)" etc.) is the desired complemented verb phrase.

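The example lookup described above can be sketched as a plain substring search. This is an illustration only: the three-sentence corpus is the content of Figure 6.1, the function name is hypothetical, and this section does not specify how the system chooses the matching part or ranks several latter parts. Because none of the three sentences contains "SOU", the sketch searches for the shorter expression "UMAKU IKUTOWA" named in the caption of Figure 6.1.

```python
# The sentences of Figure 6.1, used here as a tiny stand-in corpus.
CORPUS = [
    "KON'NANI UMAKU IKUTOWA OMOENAI",
    "ITUMO UMAKU IKUTOWA KAGIRANAI",
    "KANZENNI UMAKU IKUTOWA IENAI",
]


def latter_parts(matching_part, corpus):
    """Collect what follows `matching_part` in every corpus sentence containing it."""
    results = []
    for sentence in corpus:
        index = sentence.find(matching_part)
        if index == -1:
            continue  # the expression does not occur in this sentence
        rest = sentence[index + len(matching_part):].strip()
        if rest:
            results.append(rest)
    return results


candidates = latter_parts("UMAKU IKUTOWA", CORPUS)
```

Each collected latter part is a candidate for the complemented verb phrase; a fuller version would need some way to choose among them, for example by frequency in a large corpus.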


6.2 Categories of Verb Phrase Ellipsis 



We handle only ellipses at the ends of sentences. Although there are some ellipses
in the inner part of sentences, we think that they should be solved as a problem of
syntax and we do not deal with them.

We classified verb phrase ellipses from the viewpoint of machine processing.
The classification is shown in Figure 6.2. First, we classified verb phrase ellipses
by checking whether there is a complemented verb phrase in the sentences or
not. If there is a complemented verb phrase in the sentences, we classified verb
phrase ellipses by checking whether the complemented verb phrase is in the same
sentence or in the previous sentence. Finally, we classified verb phrase ellipses
by meaning. "In the sentences", "Outside the sentences", "In the same sentence", and
"In the previous sentence" in Figure 6.2 each represent where the complemented verb
phrase exists. Although the above classification is not perfect and
needs modification, we think that it is useful for understanding the outline of verb
phrase ellipses in machine processing.

The features and the analysis of each category of verb phrase ellipsis are
described in the following sections.

The feature and the analysis of each category of verb phrase ellipsis are de- 
scribed in the following sections. 



6.2.1 When a Complemented Verb Phrase Ellipsis Appears in
the Sentences

Inverted Sentence 

Inverted sentences have expressions which are normally at the end of a sentence 
in the inner part of the sentence. For example, the following sentence has the 
words "DARE DESUKA (Who is)", an inverted expression normally at the end 
of a sentence. 

DARE DESUKA, KITA-NO-WA 

(who) (is) (the person that came here) (6.3)

(Who was the person that came here?) 

Therefore, we analyze inverted sentences as follows. When a sentence has an
expression which is normally at the end of a sentence and is followed by a comma,
the system judges the sentence to be an inverted sentence.

Question Answer

In question-answer sentences, verbs in answer sentences are often omitted when
the answer sentences use the same verb as the question sentences. For example, the verb
of "KORE WO (this)" is omitted and is "KOWASHITA (break)" in the question
sentence.

NANI-WO KOWASHITANO 

(what) (break) 

(What did you break?) 

(6.4) 
KORE- WO [KOWASHITA]. 

(this) (break) 

([I broke] this.) 

The system judges whether the sentences are question-answer sentences or 
not by using surface expressions such as "NANI (what)", and, if so, it judges that 
the complemented verb phrase is the verb phrase of the question sentence. 
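
The question-answer rule can be sketched as follows. This is a minimal illustration under stated assumptions: the set of interrogative clue words is illustrative (the text names only "NANI (what)" as a clue for this rule), the verb of the question is assumed to be supplied by an earlier parsing step, and the function name is hypothetical.

```python
import re

# Clue words for questions; only "NANI" is named in the text for this rule.
INTERROGATIVES = {"NANI", "DARE"}


def complete_answer(question, question_verb, answer):
    """Return the answer with the question's verb supplied, or None.

    question_verb is assumed to come from an earlier analysis of the question.
    """
    tokens = re.split(r"[-\s]+", question.strip())
    if not any(token in INTERROGATIVES for token in tokens):
        return None  # not recognized as a question-answer pair
    # The omitted verb phrase is marked with brackets, as in example (6.4).
    return f"{answer} [{question_verb}]"
```

Applied to example (6.4), the answer "KORE-WO" is completed to "KORE-WO [KOWASHITA]".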

Relation 

In verb phrase ellipsis, there is a phenomenon in which an elliptical sentence
ending in a conjunctive particle relates causally, contrastively, or conditionally to
the previous sentence, so that the two sentences together form an inverted sentence.
For example, "DENKI-WO TSUKETA-KARA (Because I switched the light on.)" is
the reason for the previous sentence "AKARUINE (bright)". The omitted element
of "DENKI-WO TSUKETA-KARA" is "AKARUINE (bright)".

AKARUI. 

(bright) 

(Bright.) 

(6.5) 
DENKI-WO TSUKETA-KARA. 

(the light) (switch on) 

(Because I switched the light on.) 

When a sentence has a conjunctive particle at the end, the system normally
judges that the complemented verb phrase is the verb at the end of the previous
sentence. However, a conjunctive particle is sometimes used to indicate
hesitation, and in that case the sentence is not in contrast to the previous
sentence.

OKIKI-SHITE IINOKA WAKARIMASENGA. 

(ask) (whether it is all right) (do not know) (6.6)

(Although I don't know whether you mind I ask you, ...) 

Therefore, for "NONI (but)", which readily relates to the previous sentence,
the system judges that the complemented verb phrase is the previous sentence.
For the other particles, if the previous sentence is an interrogative sentence,
the system judges that the sentence contrasts with the previous sentence;
otherwise, it judges that the sentence does not contrast with the previous
sentence and merely expresses a kind of feeling.
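
The decision just described is small enough to write down directly. The sketch
below is a hedged illustration only: the function name and the returned labels
are assumptions made for this example, and the particle is assumed to have been
identified already by an earlier analysis step.

```python
def treat_final_particle(particle: str, previous_is_question: bool) -> str:
    """Decide how a sentence ending in a conjunctive particle is treated."""
    if particle == "NONI":
        # "NONI" readily relates to the previous sentence.
        return "complemented by the previous sentence"
    if previous_is_question:
        # Other particles relate to the previous sentence only after a question.
        return "complemented by the previous sentence"
    # Otherwise the particle only expresses hesitation or a similar feeling.
    return "no relation to the previous sentence"


print(treat_final_particle("NONI", False))
print(treat_final_particle("GA", False))   # e.g. sentence (6.6)
```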

Supplement 

In sentences which play a supplementary role to the previous sentence, verb 
phrases are sometimes omitted. For example, the second sentence is supplemen- 
tary, explaining that "the things I lost" is "keys" . 

MONO-WO NAKUSHITA. 

(things) (lost) 

(I lost things.) 

(6.7) 
KAGI-WO [NAKUSHITA.] 

(keys) (lost) 

([I lost] keys. ) 

To solve this, we present the following two methods using word meanings. The
first method applies when the word at the end of the elliptical sentence is
semantically similar to the word filling the same case element in the previous
sentence: the two words are judged to correspond, and the omitted verb is judged
to be the verb governing that case element in the previous sentence. In this
example, since "MONO (thing)" and "KAGI (key)" are semantically similar in the
sense that they are both objects, the system judges that they correspond, and
that the verb of "KAGI (key)" is "NAKUSHITA (lost)". The second method is for
when the same case element in the previous sentence is omitted.

NAKUSHITA. 

(lost) 

(I lost.) 

(6.8) 
KAGI-WO [NAKUSHITA.] 

(keys) (lost) 

([I lost] keys. ) 

In this case, the system checks the semantic distance between "KAGI (key)" and 
the words which are easily filled in the WO case (object) of the "NAKUSU (lose)" 
by using the case frame of the verb "NAKUSU (lose)" . If they are semantically 
similar, the system judges that the omitted verb phrase is "NAKUSU (lose)". 

In addition to these methods, we use methods using surface expressions. For 
example, when a sentence has clue words such as the particle "MO" (which in- 
dicates repetition), the sentence is judged to be the supplement of the previous 
sentence. 

There are many cases when an elliptical sentence is the supplement of the 
previous sentence. In this work, if there is no clue, the system judges that an 
elliptical sentence is the supplement of the previous sentence. 

* The IPAL case frame dictionary [IPAL 87] has information on what kinds of
words can fill each case frame. In this work, we use this information.

6.2.2 When a Complemented Verb Phrase does not Appear in
the Sentences

Interrogative Sentence

Sometimes, in interrogative sentences, the particle "WA" is at the end of the
sentence and the verb phrase is omitted. For example, the following sentence is
an interrogative sentence and the verb phrase is omitted.

NAMAE-WA [NANI-DESUKA.] 

(name) (what?) (6.9) 

([What is] your name?) 

If the end is of the form "Noun + WA", the sentence is probably an
interrogative sentence, and thus the system judges it to be an interrogative
sentence.

da-Ellipsis

When the end of the previous sentence is a noun phrase, the copula "DA (be)" is 
often omitted. 

WATASHI-WA GAKUSEI [DESU]. 

(I) (student) (be) (6.10) 

(I [am] a student.) 

In this example, the copula "DA (be)" is omitted from the sentence "WATASHI-WA
GAKUSEI DESU (I am a student.)".

The analysis of this case is performed by checking whether the end of the
sentence is a noun phrase and whether the syntactic structure contains a
subject.

suru-Ellipsis

When the end of the previous sentence is a noun phrase, the basic verb "SURU 
(do)" is often omitted. 

WATASHI-WO KODOMO-ATSUKAI [SURU]. 

(I) (to treat as child) (do) (6.11) 

(He [does] treat me as a child.)



^ Since this work concerns verb phrase ellipsis resolution, the system should
complement a verb phrase such as "NANI-DESUKA (what?)". However, the expression
of the verb phrase changes according to the content of the interrogative
sentence, and we do not deal with this problem in this work.

In this example, the verb "SURU (do)" is omitted from the sentence "WATASHI-WO
KODOMO-ATSUKAI SURU. (He treats me as a child.)".

The analysis of this problem is done by checking whether the end of the
sentence is a verbal noun and whether a rentai-form modifier modifies the
verbal noun.*

Other Ellipses (Resolved Using Common Sense) 

In the case of "Outside the sentences" the following example exists besides
"Interrogative sentence", "da-ellipsis", and "suru-ellipsis".

JITSU-WA CHOTTO ONEGAIGA [ARU-NO-DESUGA]. 

(the truth) (a little) (request) (I have) (6.12) 

(To tell you the truth, [I have] a request.) 

This kind of ellipsis has no complemented expression in the sentences, and the
complemented expression can take various forms. This problem is therefore
difficult to analyze.

To solve this problem, we estimate the complemented content by using a large
amount of linguistic data.

When Japanese people read the above sentence, they naturally recognize that the
omitted verb is "ARIMASU (I have)". This is because they have empirically
acquired sentences such as "JITSU-WA CHOTTO ONEGAIGA ARU-NO-DESUGA. (To tell
you the truth, I have a request.)". To perform the same interpretation with a
large amount of linguistic data, we detect sentences containing an expression
which is semantically similar to "JITSU-WA CHOTTO ONEGAIGA. (To tell you the
truth, (I have) a request.)", and the part that follows "JITSU-WA CHOTTO
ONEGAIGA" in those sentences is judged to be the content of the ellipsis.
In this work, we solve this problem by using the above method.

6.3 Verb Phrase Ellipsis Resolution System 



* A modifier is in the rentai-form when it modifies a nominal phrase.

6.3.1 Procedure

In this work, verb phrase ellipses are resolved in the same framework as in the
preceding chapters. Before the verb phrase ellipsis resolution process,
sentences are transformed into a case structure by the case structure analyzer
[Kurohashi & Nagao 94]. Verb phrase ellipses are resolved by heuristic rules
for each sentence from left to right. Using these rules, our system gives
points to possible complemented verb phrases, and it judges that the possible
complemented verb phrase with the maximum point total is the desired
complemented verb phrase.
The heuristic rules are given in the following form. 

Condition ⇒ { Proposal, Proposal, ... }

Proposal := ( Possible complemented verb phrase, Point )

Surface expressions, semantic constraints, referential properties, etc., are written 
as conditions in the Condition section. A possible complemented verb phrase 
is written in the Possible complemented verb phrase section. Point means the 
plausibility of the possible complemented verb phrase. 
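
The rule form and the maximum-total selection described above can be
illustrated with a minimal Python sketch. Everything in it is an assumption made
for illustration: the Rule class, the dictionary representation of a sentence,
and the two toy rules with their points are hypothetical and are not the
system's actual rule set.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# A proposal is (possible complemented verb phrase, point).
Proposal = Tuple[str, float]


@dataclass
class Rule:
    name: str
    condition: Callable[[dict], bool]            # the Condition section
    proposals: Callable[[dict], List[Proposal]]  # the Proposal list


def resolve(sentence: dict, rules: List[Rule]) -> Optional[Proposal]:
    """Sum the points each candidate receives and return the best one."""
    totals: Dict[str, float] = defaultdict(float)
    for rule in rules:
        if rule.condition(sentence):
            for candidate, point in rule.proposals(sentence):
                totals[candidate] += point
    if not totals:
        return None
    return max(totals.items(), key=lambda item: item[1])


# Two toy rules with illustrative points (hypothetical).
RULES = [
    Rule("ends in noun + WA",
         lambda s: s["ending"].endswith("-WA"),
         lambda s: [("interrogative sentence", 3)]),
    Rule("previous sentence is a question",
         lambda s: s["previous_is_question"],
         lambda s: [("verb phrase at the end of the previous sentence", 1)]),
]

best = resolve({"ending": "NAMAE-WA", "previous_is_question": True}, RULES)
print(best)
```

Because points are summed per candidate, several weak rules that agree can
outweigh one stronger rule, which is the behaviour the point totals are meant
to provide.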

6.3.2 Heuristic Rule 

We made 22 heuristic rules for verb phrase ellipsis resolution. We show all the
rules in Table 6.1. These rules were made by examining by hand the training
sentences described in Section 6.4. When the system analyzes verb phrase
ellipsis, it also analyzes anaphora in noun phrases and pronouns. The rules for
this resolution are shown in Chapter 3, Chapter 4, and Chapter 5.



For these rules a semantic marker dictionary [Watanabe et al 92] is used to 
determine whether a word means a human, time, etc. 

The value s in Rule 12 and Rule 13 is the semantic similarity between Noun X
and Noun Y in the EDR concept dictionary [EDR 95b]. This similarity is given by

$$ s = \frac{n_z + n_z}{n_x + n_y} $$

where $n_x$ is the number of links between the top node and the node of Noun X,
$n_y$ is the number of links between the top node and the node of Noun Y,
node Z is the node at which the paths from Noun X and Noun Y to the top node
meet, and $n_z$ is the number of links between the top node and node Z
[Nagao et al 9q].
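
As a hedged illustration of this similarity, the Python sketch below computes
$(n_z + n_z)/(n_x + n_y)$ over a tiny concept tree. The tree stored in PARENT is
invented for the example and is not taken from the EDR concept dictionary; the
point formula s * 20 - 2 is the one listed for Rule 12 and Rule 13 in Table 6.1.

```python
# Hypothetical concept tree: child -> parent. "top" is the root node.
PARENT = {
    "concrete": "top",
    "abstract": "top",
    "artifact": "concrete",
    "key": "artifact",
    "wallet": "artifact",
    "idea": "abstract",
}


def path_to_top(node: str) -> list:
    """Return the list of nodes from `node` up to the top node."""
    path = [node]
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path


def similarity(x: str, y: str) -> float:
    """(nz + nz) / (nx + ny), where Z is the node where the two paths meet."""
    path_x, path_y = path_to_top(x), path_to_top(y)
    nx, ny = len(path_x) - 1, len(path_y) - 1
    on_path_x = set(path_x)
    z = next(node for node in path_y if node in on_path_x)
    nz = len(path_to_top(z)) - 1
    if nx + ny == 0:          # both nouns are the top node itself
        return 1.0
    return (nz + nz) / (nx + ny)


def rule_point(s: float) -> float:
    """Point given by Rule 12 and Rule 13 for a similarity value s."""
    return s * 20 - 2


print(similarity("key", "wallet"), similarity("key", "idea"))
print(rule_point(similarity("key", "wallet")))
```

Nouns that share a deep common ancestor score close to 1, while nouns whose
paths meet only at the top node score 0 and therefore receive a negative point.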






MURI-MO-ARIMASENWA.
(You may well do so.)

HAJIMETE OAISURU-NO-DESUKARA.
(for the first time) (I meet you)
(I meet you for the first time)

JITSU-WA CHOTTO ONEGAIGA (ARU-NO-DESUGA).
(the truth) (a little) (request) (I have)
(To tell you the truth, [I have] a request.)

Candidate      the end of the previous sentence    "ARIMASU (I have)"
Rule 16        point
Rule 22                                            1 point
Total score    point                               1 point

Frequency of the latter part of the sentences containing "ONEGAI GA":
ARIMASU (I have)
ARU (I have)

Figure 6.3: Example of verb phrase ellipsis resolution



The corpus (linguistic data) used in Rule 22 is a set of newspapers (one year,
about 70,000,000 characters). Similar sentences are detected by sorting the
corpus in advance and using a binary search.
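
The sort-then-binary-search lookup can be sketched as follows. This is a toy
illustration under loud assumptions: the four-sentence corpus is invented, the
sentences are split on spaces (the system itself uses simple character
matching), and the function names are hypothetical. The sketch looks for the
longest tail of the input that occurs in the corpus and counts what follows it.

```python
from bisect import bisect_left
from collections import Counter


def build_index(sentences):
    """Sort every word-level suffix of every sentence (done once, in advance)."""
    suffixes = []
    for sentence in sentences:
        words = sentence.split()
        for i in range(len(words)):
            suffixes.append(tuple(words[i:]))
    suffixes.sort()
    return suffixes


def continuations(index, query):
    """Count what follows `query` in the corpus, located by binary search."""
    q = tuple(query)
    found = Counter()
    for suffix in index[bisect_left(index, q):]:
        if suffix[:len(q)] != q:
            break                      # left the block of matching suffixes
        rest = suffix[len(q):]
        if rest:
            found[" ".join(rest)] += 1
    return found


def complete(index, words):
    """Use the longest tail of the input that has a continuation in the corpus."""
    for start in range(len(words)):
        found = continuations(index, words[start:])
        if found:
            return found.most_common()
    return []


# Toy corpus (hypothetical).
CORPUS = [
    "CHOTTO ONEGAIGA ARIMASU",
    "HITOTSU CHOTTO ONEGAIGA ARIMASU",
    "CHOTTO ONEGAIGA ARU",
    "KYOU-WA AME-GA FURU",
]
INDEX = build_index(CORPUS)
print(complete(INDEX, ["JITSU-WA", "CHOTTO", "ONEGAIGA"]))
```

Sorting places all suffixes that share a prefix next to each other, so a single
binary search finds the start of the matching block and the scan stops as soon
as the prefix no longer matches.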



6.3.3 Example of Verb Phrase Ellipsis Resolution 



We show an example of verb phrase ellipsis resolution in Figure 6.3. Figure 6.3
shows that the verb phrase ellipsis in "ONEGAI (request)" was analyzed well.

Since the end of the sentence is not an expression which can normally be at 
the end of a sentence, Rule 1 was not satisfied and the system judged that a verb 
phrase ellipsis exists. By Rule 16 the system took the candidate "the end of the 
previous sentence". Next, by Rule 22 using corpus, the system took the candidate 
"ARIMASU (I have)". Although there are "ARU (I have)" and "ARIMASU (I 

Table 6.1: Rules for verb phrase ellipsis resolution

Rules in the case that a verb ellipsis does not exist

Rule 1
  Condition: When the end of the sentence is a formal form of a verb or
             terminal postpositional particles such as "YO" and "NE",
  Candidate: the system judges that a verb phrase ellipsis does not exist.
  Point:     30
  Example:   SONO MIZUUMI WA, KITANO KUNINI ATTA. (The lake was in a northern
             country.)

Rule 2
  Condition: When the end of the sentence is a person's name or a word
             signifying a human being,
  Candidate: a verb phrase ellipsis does not exist.
  Point:     30
  Example:   "HAI, SENSEI." ("Yes, sir.")

Rule 3
  Condition: When the end is an imperative form of a verb,
  Candidate: the sentence is an imperative sentence and a verb phrase ellipsis
             does not exist.
  Point:     30
  Example:   "SAA, MEWO TSUBUTTE" (Here, close your eyes.)

Rule 4
  Condition: When the end is the conjunctive particle "GA",
  Candidate: a verb phrase ellipsis does not exist.
  Point:     5
  Example:   "CHOTTO SHITSUMON-GA ARUNO DESUGA" (Well, I have some questions.)

Rules in the case of "Inverted sentence"

Rule 5
  Condition: When the sentence has an expression normally at the end of a
             sentence in the inner part,
  Candidate: it is judged to be an inverted sentence.
  Point:     10
  Example:   "DARE DESUKA, KITA-NO-WA" ("Who was the person that came here?")

Rules in the case of "Question-Answer"

Rule 6
  Condition: When the sentence has an expression which indicates a reply and
             the previous sentence has an expression which indicates an
             interrogative sentence such as "KA (?)",
  Candidate: the verb phrase at the end of the interrogative sentence
  Point:     5
  Example:   "CHIKAYOTTE KANSATSU SHITEMO IIDESHOUKA." "DOUZO, GOJIYUUNI..."
             ("Can I approach and look at this?" "Yes, please.")

Table 6.1: Rules for verb phrase ellipsis resolution (cont.)

Rules in the case of "Question-Answer"

Rule 7
  Condition: When the previous sentence has an interrogative pronoun such as
             "DARE (who)" and "NANI (what)",
  Candidate: the verb modified by the interrogative pronoun
  Point:     5
  Example:   "DARE-WO KOROSHITANDA" "WATASHI-GA KATTE-ITA SARU-WO [KOROSHITA]"
             ("Who did you kill?" "[I killed] my monkey")

Rules in the case of "Relation"

Rule 8
  Condition: When the end is a postpositional particle which indicates cause
             such as "NODE" and "KARA",
  Candidate: the sentence is interpreted to be the reason for the previous
             sentence
  Point:     5
  Example:   "TOCHI-WO AGERU-WAKE-NIWA-IKANAI. SOKONI, YASHIRO-WO
             TATE-NAKUTEWA-NARANAI-NODAKARA" ("We can't give you the lot.
             Because we must build a shrine there.")

Rule 9
  Condition: When the end is a postpositional particle such as "NONI" and
             "KEREDOMO",
  Candidate: the sentence is interpreted to contrast with the previous
             sentence.
  Point:     5
  Example:   "KORE-GA AKUMA-TOWA-NEE. MOU-SUKOSHI DOUDOU-TO SHITA MONO-KA-TO
             OMOTTE-ITA-NONI" ("This is a devil. Although I thought it was
             majestic.")

Rule 10
  Condition: When the end is a conditional form of a verb or postpositional
             particles indicating conditions,
  Candidate: the sentence is interpreted to be the condition of the previous
             sentence.
  Point:     5
  Example:   "SORENARA, IIJANAIKA. NANIMO, KOUBAN-NI-MADE KONAKUTEMO." (It is
             good. Unless you came to the police office.)

Table 6.1: Rules for verb phrase ellipsis resolution (cont.)

Rules in the case of "Supplement"

Rule 11
  Condition: When the end is an infinitive form of a verb,
  Candidate: the sentence is interpreted to be the supplement of the previous
             sentence and the verb phrase at the end of the previous sentence
             is judged to be the complemented verb phrase
  Point:     5
  Example:   MESHITSUKAI-WA HEYA-NI HAIRI, ESA-WO TORIKAETA. SHUUKURIIMU-MO
             KUWAETE [TORIKAETA]. (A servant came into the room and changed
             the pet food. [He changed it] with a cream puff.)

Rule 12
  Condition: When the end is Noun X followed by a case postpositional
             particle, there is a Noun Y followed by the same case
             postpositional particle in the previous sentence, and the
             semantic similarity between Noun X and Noun Y is a value s,
  Candidate: the verb phrase modified by Noun Y
  Point:     s * 20 - 2
  Example:   SUBETENO AKU-GA NAKUNATTEIRU. GOUTOU-DA-TOKA SAGI-DA-TOKA,
             ARAYURU HANZAI-GA [NAKUNATTEIRU]. (All the evils have
             disappeared. All the crimes such as robbery and fraud [have
             disappeared].)

Rule 13
  Condition: When the end is Noun X followed by a case postpositional
             particle, there is a zero pronoun of a verb phrase Y in the same
             case element in the previous sentence, and the semantic
             similarity between Noun X and the words which easily fill the
             zero pronoun, as described in the case frame, is a value s,
  Candidate: the verb phrase Y
  Point:     s * 20 - 2
  Example:   WATASHI-WA [JUUTAKU-WO] DOURYOU-NI YUBISASHITE MISETA. OOKINA
             NIRE-NO-KI NO SHITA-NI ARU KOHUUNA TSUKURI-NO JUUTAKU-WO.
             (I pointed my colleague [to the house]. An old-fashioned house
             under the big elm.)

Table 6.1: Rules for verb phrase ellipsis resolution (cont.)

Rules in the case of "Supplement"

Rule 14
  Condition: When the end is the postpositional particle "MO" or there is an
             expression which indicates repetition such as "MOTTOMO", the
             repetition of the same speaker's previous sentence is
             interpreted,
  Candidate: the verb phrase at the end of the same speaker's previous
             sentence is judged to be a complemented verb phrase
  Point:     5
  Example:   "OTONATTE WARUI KOTO BAKARI SHITEIRUNDAYO. YOKU WAKARANAIKEREDO,
             WAIRO NANTE KOTO-MO [SHITEIRUNDAYO]." ("Adults do only bad
             things. I don't know, but [they do] bribe.")

Rule 15
  Condition: When the previous sentence is an interrogative sentence,
  Candidate: the verb phrase in the end of the previous sentence
  Point:     1
  Example:   (none)

Rule 16
  Condition: In all cases.
  Candidate: the previous sentence
  Point:     (not given)
  Example:   (none)

Rules in the case of "Interrogative sentence"

Rule 17
  Condition: When the end is a noun followed by postpositional particle "WA",
  Candidate: the sentence is interpreted to be an interrogative sentence.
  Point:     3
  Example:   "NAMAE-WA [NANI-DESUKA]" ("[What is] your name?")

Rules in the case of "da-ellipsis"

Rule 18
  Condition: When the end is a noun or a postpositional particle such as
             "BAKARI (only)", "DAKE (only)", and there is a noun phrase
             followed by a postpositional particle "WA (topic)",
             "MO (subject)", and "GA (subject)" which corresponds to the
             subject in the sentence.
  Candidate: the system judges it as da-ellipsis
  Point:     2
  Example:   "KORE-WA WATASHI-NO KANCHIGAI [DESU]" ("This [is] my mistake.")

Rule 19
  Condition: When the end is a noun which signifies time,
  Candidate: the system judges it as da-ellipsis
  Point:     5
  Example:   SONO TSUGI-NO NATSU [NO-KOTO-DESU]. ([It is] the next summer.)

Table 6.1: Rules for verb phrase ellipsis resolution (cont.)

Rules in the case of "da-ellipsis"

Rule 20
  Condition: When the end is a noun or a postpositional particle such as
             "BAKARI (only)", "DAKE (only)",
  Candidate: the system judges it as da-ellipsis
  Point:     1
  Example:   ATO-WA KOUGEKI-WO MATSUBAKARI [DESU]. (What I do [is] only wait
             for the attack.)

Rules in the case of "suru-ellipsis"

Rule 21
  Condition: When the end is a verbal noun which is not modified by a
             rentai-form modifier.
  Candidate: the system judges it as suru-ellipsis
  Point:     2
  Example:   WATASHI-WO KODOMO-ATSUKAI [SURU]. (He [does] treat me like a
             child.)

Rules in the case of use of common sense

Rule 22
  Condition: When the system detects a sentence containing the longest
             expression at the end of the sentence from corpus. (If the
             highest frequency is much higher than the second highest
             frequency, the expression is given 9 points, otherwise it is
             given 1 point.)
  Candidate: the expression of the highest frequency in the latter part of
             the detected sentences
  Point:     1 or 9
  Example:   SOU UMAKU IKUTOWA [OMOENAI]. ([I don't think] it will succeed.)

have)", the frequency of "ARIMASU (I have)" is higher than that of the others,
and it was selected as a candidate. The candidate "ARIMASU (I have)", having
the best score, was properly judged to be the desired complemented verb phrase.

6.4 Experiment and Discussion 



We ran the experiment on the novel "BOKKOCHAN" [Hoshi 71]. This is because
novels contain various verb ellipses. In the experiment, we divided the text
into training sentences and test sentences. We made heuristic rules by
examining training sentences. We tested our rules by using test sentences. We
show the results of verb phrase ellipsis resolution in Table 6.2.

To judge whether the result is correct or not, we used the following evaluation 
criteria. When the complemented verb phrase is correct, even if the tense, aspect, 
etc. are incorrect, we regard it as correct. For ellipses in interrogative sentences, if 
the system estimates that the sentence is an interrogative sentence, we judge it to 
be correct. When the desired complemented verb phrase appears in the sentences 
and the complemented verb phrase chosen by the rule using corpus is nearly equal 
to the correct verb phrase, we judge that it is correct. 

6.4.1 Discussion 



As shown in Table 6.2, we obtained a recall rate of 84% and a precision rate of
82% in the resolution of verb phrase ellipsis on test sentences. This indicates
that our method is effective.
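
As a small worked check of these figures, the sketch below recomputes the two
rates from the counts in the "Total score" row for test sentences in Table 6.2
(125 correct resolutions, 148 sentence ends that have an ellipsis, 152 sentence
ends judged to have one). The helper name rate is an assumption made for this
example.

```python
from typing import Optional


def rate(correct: int, total: int) -> Optional[int]:
    """Percentage rounded to an integer; None when there are no cases (0/0)."""
    if total == 0:
        return None
    return round(100 * correct / total)


# Counts from the "Total score" row for test sentences in Table 6.2.
recall = rate(125, 148)      # correct / sentence ends that have an ellipsis
precision = rate(125, 152)   # correct / sentence ends judged to have an ellipsis
print(recall, precision)
```

The two rates share the same numerator and differ only in the denominator,
which is why a category with no cases appears as "-% (0/0)" in the table.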

The recall rate of "In the sentences" is higher than that of "Outside the sen- 
tences". For "In the sentences" the system only specifies the location of the 
complemented verb phrase. But in the case of "Outside the sentences" the sys- 
tem judges that the complemented verb phrase does not exist in the sentences 
and gathers the complemented verb phrase from other information. Therefore 
"Outside the sentences" is very difficult to analyze. 

The accuracy rate of "Other ellipses (use of common sense)" was not so high. 
But, since the analysis of the case of "Other ellipses (use of common sense)" is 
very difficult, we think that it is valuable to obtain a recall rate of 56% and a 

Table 6.2: Result of resolution of verb phrase ellipsis

                            Training sentences             Test sentences
                            Recall         Precision       Recall         Precision
Total score                 92% (129/140)  90% (129/144)   84% (125/148)  82% (125/152)
In the sentences            100% (57/57)   85% (57/67)     94% (64/68)    81% (64/79)
  Inverted sentence         100% (13/13)   93% (13/14)     100% (8/8)     80% (8/10)
  Question-Answer           100% (3/3)     100% (3/3)      -% (0/0)       -% (0/0)
  Relation                  100% (24/24)   89% (24/27)     100% (33/33)   85% (33/39)
  Supplement                100% (17/17)   74% (17/23)     85% (23/27)    77% (23/30)
Outside the sentences       87% (72/83)    94% (72/77)     76% (61/80)    84% (61/73)
  Interrogative sentence    100% (3/3)     75% (3/4)       -% (0/0)       0% (0/3)
  da-ellipsis               100% (54/54)   100% (54/54)    100% (51/51)   96% (51/53)
  suru-ellipsis             100% (2/2)     100% (2/2)      -% (0/0)       -% (0/0)
  Other ellipses            72% (13/18)    76% (13/17)     56% (10/18)    59% (10/17)
  Impossible                0% (0/6)       -% (0/0)        0% (0/11)      -% (0/0)

The training sentences are used to make the set of rules in Section 6.3.2.

Training sentences {the first half of a collection of short stories "BOKKO CHAN"
[Hoshi 71] (2614 sentences, 23 stories)}

Test sentences {the latter half of "BOKKO CHAN" [Hoshi 71] (2757 sentences,
25 stories)}

Precision is the fraction of correctly resolved cases among the sentence ends
which were judged to have verb phrase ellipses. Recall is the fraction of
correctly resolved cases among the sentence ends which actually have verb
phrase ellipses. We use both precision and recall because the system sometimes
judges that sentence ends without verb phrase ellipses have them, and we want
to count these errors properly.

We made a new category "Impossible" which is not in Figure 6.2. This category
covers cases where the utterance is interrupted in the middle of the sentence,
or where the reader cannot recognize the omitted content. Since these cases are
difficult to resolve and we want to properly evaluate the method of "use of
common sense", we separated this category from "Other ellipses (use of common
sense)".

precision rate of 59%. In both training sentences and test sentences, about
half of all the error cases occurred either because the solution proposed by
the corpus-based rule was correct but its point was lower than that of another
rule, or because the correct answer had not the highest frequency but the
second or third highest. This indicates that there is room for improving the
corpus-based method. We think that this method will become very important as
the corpus grows larger. At present we calculate the similarity between the
input sentence and the example sentence in the corpus only by simple character
matching; we think that semantic information and parts of speech must also be
used when calculating the similarity. Moreover, we must detect the desired
sentence by using only examples whose previous sentence is of the same type
(interrogative or not) as the previous sentence of the input sentence.

Although the accuracy rate of the category using surface expressions is already 
high, there are some incorrect cases which can be corrected by refining the use of 
surface expressions in each rule. There is also a case which requires a new kind 
of rule in the experiment on test sentences. 

SONOTOTAN WATASHI-WA OOKINA HIMEI-WO KIITA. 

(at the moment) (I) (a scream) (hear) 

(At that moment, I heard a loud scream.)

(6.13) 
NANIKA-NI OSHITSUBUSARERU-YOUNA OSOROSHII KOE-NO. 

(something) (be crushed) (fearful) (voice) 

(of a fearful voice such that he was crushed by something) 

In these sentences, "OSOROSHII KOE-NO (of a fearful voice)" is the supplement 
of "OOKINA HIMEI (a scream)" in the previous sentence. To solve this ellipsis, 
we need the following rule. 

When the end is of the form "noun X + NO (of)" and there is a
noun Z which is semantically similar to noun Y in the examples
of "noun X + NO (of) + noun Y", the system judges that the
sentence is the supplement of noun Z.                          (6.14)

We experimented on novels in order to detect various ellipses. To check what 
kind of phenomena exist in other texts, we counted the number of ellipses in 

Table 6.3: The number of ellipses in essays in "TENSEI JINGO"

                                     In quotations   Outside quotations   Total
Total                                      5                 34             39
In the sentences                           1                  1              2
  Inverted sentence, Question-Answer,
  Relation, Supplement (combined)          1                  1              2
Outside the sentences                      4                 33             37
  Interrogative sentence
  da-ellipsis                                                28             28
  suru-ellipsis
  Other ellipses                           4                  5              9

essays "TENSEI JINGO" (79 stories, 1871 sentences). The results are shown in
Table 6.3. We find that the number of ellipses is small in essays, where there
are few conversational sentences. Although there are five cases of "Other
ellipses" outside conversational sentences, they are all of the form "TO +
human being", such as "'... TAISHO-SURU' TO SHUSHOU [GA-ITTA]. ('I will take
...', [said] the prime minister)". There are not many different kinds of
elliptical phenomena in essays.



6.5 Summary 



This chapter described a practical way to resolve omitted verb phrases by using
surface expressions and examples. We obtained a recall rate of 84% and a
precision rate of 82% in the resolution of verb phrase ellipsis on test
sentences. The accuracy was high when the complemented verb phrase appears in
the sentences, and not so high when the corpus (examples) was used. Since the
analysis of this phenomenon is very difficult, we think that it is valuable to
have proposed a way of solving the problem to a certain extent. We think that
as corpora become larger and machine performance improves, the corpus-based
method will become more effective.



Chapter 7 



Conclusion 



Anaphora resolution is important for language understanding, machine transla- 
tion, and dialogue processing. We resolved varieties of anaphora by using surface 
expressions and examples. We experimented on several kinds of texts to test our 
methods. The results of these experiments indicate that our methods are effective. 

7.1 Summary 

Chapter 2 described a method of determining the referential property and number
of noun phrases in Japanese sentences using surface expressions. The referential 
property of a noun phrase is how the noun phrase denotes the referent. The 
referential property is classified into three types: generic, definite and indefinite. 
A definite noun phrase refers to a given object. An indefinite noun phrase refers 
to a new object. In English, they correspond to a noun phrase with a definite 
article and a noun phrase with an indefinite article, respectively. A generic noun 
phrase refers to all objects which the noun phrase denotes. The number of a 
noun phrase is the number of the referent denoted by the noun phrase. The 
number is classified into three types: singular, plural, and uncountable. The 
referential property and the number of a noun phrase are basic factors in anaphora 
resolution. The system can grasp the outline of the referent of the noun phrase by 
using the referential property and the number of a noun phrase. The referential 
property and the number are also useful when the system generates the article
in translating Japanese nouns into English. Many rules for the estimation of
the referential property and the number of a noun phrase were written in forms 
similar to rewriting rules in expert systems with scores. We obtained the correct 
recognition scores of 85.5% and 89.0% in the estimation of referential property 
and number respectively for the sentences which were used for the construction 
of our rules. We tested these rules for some other texts, and obtained the scores 
of 68.9% and 85.6%, respectively. 

Chapter 3 gave a method for estimating the referent of a noun phrase in
Japanese sentences using referential properties, modifiers, and possessors of noun 
phrases. Since there are no articles in the Japanese language, it is difficult to 
decide whether two noun phrases have the same referent in Japanese. But we 
researched referential properties of noun phrases that correspond to articles using 
words in the sentences as in Chapter 2. We estimated referents of noun phrases
using these referential properties. For example if the referential property of a 
noun phrase is definite, the noun phrase can refer to a noun phrase that appears 
previously, and if the referential property of a noun phrase is indefinite, the noun 
phrase cannot refer to a noun phrase that appears previously. Furthermore we 
estimated referents of noun phrases using modifiers and possessors of noun phrases 
more precisely. As a result, we obtained a precision rate of 82% and a recall rate 
of 85% in the estimation of referent of noun phrases that have antecedents on 
training sentences, and obtained a precision rate of 79% and a recall rate of 77% 
on test sentences. We verified that it is effective to use referential properties, 
modifiers, and possessors of noun phrases through experiments. 

Chapter 4 described how to resolve indirect anaphora. A noun
phrase can indirectly refer to an entity that has already been mentioned. For 
example, "There is a house. The roof is white." indicates that "the roof" is 
associated with "a house", which was mentioned in the previous sentence. This 
kind of reference (indirect anaphora) has not been studied well in natural language 
processing, but is important for coherence resolution, language understanding, and 
machine translation. When we analyze indirect anaphora, we need a case frame
dictionary for nouns containing information about relationships between two
nouns. But no noun case frame dictionary exists at present. Therefore, we used
examples of "X of Y" and a verb case frame dictionary instead. We estimated
indirect anaphora by using this information, and obtained a recall rate of 63%
and a precision rate of 68% on test sentences. This indicates that the information
of "X of Y" is useful when we cannot make use of a noun case frame dictionary. 
We made a hypothetical estimation that we can use a good noun case frame 
dictionary, and obtained the result with the recall and the precision rates of 71% 
and 82%, respectively. Finally we proposed how to construct a noun case frame 
dictionary from examples of "X of Y" . 
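
As a rough illustration of how examples of "X of Y" can stand in for a noun case frame dictionary, the following Python sketch looks for the most recent previously mentioned noun Y such that "X of Y" is attested. The example set `X_OF_Y`, the function name, and the most-recent-first search order are all hypothetical choices for this sketch; the chapter's actual method also uses a verb case frame dictionary.

```python
# Hypothetical collection of attested "X of Y" pairs, stored as (X, Y).
X_OF_Y = {("roof", "house"), ("door", "house"), ("price", "car")}


def indirect_antecedent(noun, previous_nouns):
    """Return the most recent earlier noun Y for which "noun of Y" is attested."""
    for candidate in reversed(previous_nouns):
        if (noun, candidate) in X_OF_Y:
            return candidate
    return None  # no attested relation: leave the noun unresolved


# "There is a house. The roof is white."
print(indirect_antecedent("roof", ["garden", "house"]))
```
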

Chapter 5 described how to estimate the referent of a pronoun in Japanese
sentences. It is necessary to clarify the referents of pronouns in machine
translation and dialogue processing. We presented a method of estimating the
referents of demonstrative pronouns, personal pronouns, and zero pronouns in
Japanese sentences using examples, surface expressions, topics, and foci. In
conventional work, semantic markers have been used for semantic constraints. In
contrast, we used examples for semantic constraints and showed in our experiments
that examples are as useful as semantic markers. We also proposed many new methods
for estimating the referents of pronouns. For example, we used examples of the form
"X of Y" for estimating the referents of demonstrative adjectives. We used many
useful conventional methods in addition to our new methods. In experiments
using these methods, we obtained a precision rate of 87% in estimating the
referents of demonstrative pronouns, personal pronouns, and zero pronouns on
training sentences, and a precision rate of 78% on test sentences.

Chapter 6 described a method of resolving verb phrase ellipsis using surface
expressions and examples. When the complemented verb phrase appears in the
sentences, the elliptical sentence commonly has a typical form and
is resolved by using surface expressions. When the complemented verb
phrase does not appear in the sentences, the system resolves the elliptical sentence
using examples. The analysis using examples gathers sentences
containing the expression at the end of the elliptical sentence from linguistic data
and judges the part following the matching expression in the gathered sentences
to be the desired complemented verb phrase. As a result, we obtained a recall
rate of 84% and a precision rate of 82% in the resolution of verb phrase ellipsis
on test sentences.
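
The example-based step can be sketched as follows. This is a minimal sketch under stated assumptions: the corpus is a plain list of strings, matching is done by simple substring search, and the most frequent continuation is chosen. The function name, the toy English data, and the frequency-based choice are illustrative and are not taken from the chapter.

```python
from collections import Counter


def complete_ellipsis(ending, corpus):
    """Guess the omitted part that follows `ending`, the end of an elliptical sentence.

    Gathers sentences containing `ending` and takes the text after the match
    as a candidate complement. Picking the most frequent candidate is an
    assumption of this sketch.
    """
    continuations = Counter()
    for sentence in corpus:
        index = sentence.find(ending)
        if index == -1:
            continue
        rest = sentence[index + len(ending):].strip()
        if rest:
            continuations[rest] += 1
    if not continuations:
        return None
    return continuations.most_common(1)[0][0]


corpus = [
    "if you hurry you will make it",
    "if you hurry you will make it",
    "if you hurry up",
]
print(complete_ellipsis("if you hurry", corpus))
```
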

7.2 Future Work in Anaphora Resolution 

• Refinement of heuristic rules using a large collection of sentences

It is necessary to refine the heuristic rules in this work. Although the points
(certainty values) given by the heuristic rules were set using the training
sentences, they should be set automatically by a computational learning
algorithm. This requires large-scale linguistic data, both for refining the
heuristic rules and for learning the point parameters. Constructing such
linguistic data needs syntactic structure analysis and case structure
analysis. But since neither can be done with high accuracy at present, we
cannot collect large amounts of linguistic data. We must improve the syntactic
structure analyzer and the case structure analyzer before refining the
heuristic rules.

• Anaphora resolution using knowledge and reasoning 

In this work, we resolved anaphora by using only information which is
available at present. However, some problems require knowledge and
reasoning, as in the following example [Nagao et al 76].



KARE-WA MIZU-TO SHOKUEN-WO MAZETA.
(he) (water) (salt) (mixed)
(He mixed water and salt.)

KORE-WO RUTSUBO-NI SOSOIDA.
(this) (melting pot) (poured)
(He poured this into the melting pot.)

(7.1)

What "KORE (this)" refers to is salty water which comes from mixing water 
and salt. To solve this problem, we need the knowledge that if we mix water 
and salt, salty water results. Solving this kind of problem requires many 
complicated analyses. Although this problem is very difficult, we must solve 
it for anaphora resolution to improve. 



Appendix A 

Rule for Referential Property 
and Number of Noun Phrase 

We have written 86 heuristic rules for the referential property and 48 heuristic
rules for the number. All the rules are given in Table A.1 and Table A.2.
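
To make the role of the possibility (P) and value (V) columns concrete, here is a small Python sketch that combines the scores of several matching rules. The combination scheme is an assumption made for illustration and is not stated in this appendix: a category stays possible only if every matching rule gives it possibility 1, values are summed, and the possible category with the highest total wins. Blank table cells are read as 0, which is also an assumption. The three rule entries copy the scores of rules 1, 6, and 8 of Table A.1.

```python
CATEGORIES = ("indefinite", "definite", "generic")

# (possibility, value) per category; blank cells are read as 0 (assumption).
RULE_1 = {"indefinite": (0, 0), "definite": (1, 2), "generic": (0, 0)}  # personal pronoun
RULE_6 = {"indefinite": (1, 0), "definite": (1, 1), "generic": (1, 1)}  # noun + "WA", no modifier
RULE_8 = {"indefinite": (1, 0), "definite": (1, 2), "generic": (1, 3)}  # noun + "WA", non-past predicate


def estimate(matched_rules):
    """Pick the category with the highest summed value among possible ones.

    Assumed scheme: a category is possible only if all matched rules allow it.
    Ties are broken by the order of CATEGORIES.
    """
    best, best_score = None, -1
    for category in CATEGORIES:
        possible = all(rule[category][0] == 1 for rule in matched_rules)
        score = sum(rule[category][1] for rule in matched_rules)
        if possible and score > best_score:
            best, best_score = category, score
    return best


print(estimate([RULE_1, RULE_6]))
print(estimate([RULE_8]))
```
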
Table A.1: Rule for referential property

| No. | Condition | Indef P* | Indef V | Def P | Def V | Gener P | Gener V | Example |
|---|---|---|---|---|---|---|---|---|
| 1 | When a noun is a personal pronoun | | | 1 | 2 | | | KARE-WA SONO BENGOSHI-NO MUSUKO-NO HITORI-DESU. (He is a son of that lawyer.) |
| 2 | When a noun is a unique entity which does not have a modifier, such as "CHIKYU (the earth)" | | | 1 | 2 | | | OOKU-NO HITOBITO-NO MOKUHYOU-WA CHIKYUU-NO HEIWA-DESU. (The goal of many groups is peace on earth.) |
| 3 | When a noun is a proper noun which does not have a modifier | | | 1 | 2 | | | |
| 4 | When a noun is modified by a noun which signifies time | 1 | | 1 | 2 | 1 | | KYOU-NO GOGO-NO YOTEI-WA DOU-DESUKA. (What is your plan in the afternoon today?) |
| 5 | When a noun is "HOU (on the part)" | | | 1 | | 1 | | |
| 6 | When a noun is followed by a particle "WA" which does not have a modifier | 1 | | 1 | 1 | 1 | 1 | SEKIYU-JIGYO-WA WATASHI-GA TE-WO DASHITAKU NAI JIGYO-NO HITOTSU-DESU. (The oil business is one business that I don't wish to get involved with.) |
| 7 | When a noun is accompanied by a particle (WA), and the predicate is in the past tense | 1 | | 1 | 3 | 1 | 1 | IINKAI-WA ZEN'IN SONO MONDAI-WO KAIKETSU SURUTAME-NI SHIGOTO-WO SHIMASHITA. (Everyone on the committee worked to solve that problem.) |

*P: possibility, V: value
Table A.1: Rule for referential property (cont.)

| No. | Condition | Indef P | Indef V | Def P | Def V | Gener P | Gener V | Example |
|---|---|---|---|---|---|---|---|---|
| 8 | When a noun is accompanied by a particle (WA), and the predicate is not in the past tense | 1 | | 1 | 2 | 1 | 3 | DAIGAKU-WA KOUDO-NO KYOIKU-WO UKERU TOKORO-DESU. (A college is an institution of higher learning.) |
| 9 | When a noun is followed by "NIWA (topic)" or "DEWA (topic)" | 1 | | 1 | 2 | 1 | 2 | MAINICHI CHUUSHOKU-NO TOKI-NIWA BIJINESUKAI-NIWA NAGOYAKANA HITOTOKI-GA ARIMASU. (There is a bit of the piece of the business world every day at lunch time.) |
| 10 | When a noun is followed by "GA (subject)" | 1 | 2 | 1 | 1 | 1 | | KARE-NO ME-NO NAKA-NIWA KANASHIMI-GA ARIMASHITA. (There was sadness in his eyes.) |
| 11 | When a noun has a coordinate noun followed by "GA" | 1 | 2 | 1 | 1 | 1 | | HITORI-NO OTOKO-NO HITO-TO HITORI-NO ONNA-NO HITO-GA ANATA-NO GAISHUTSUCHUU-NI TAZUNETE KIMASHITA. (A man and a woman came to see you when you were gone.) |
| 12 | When a noun is modified by a pronoun | | | 1 | 3 | | | SONO JIKO-GA HASSEI-SHITE-KARA YAJIUMA-GA ATSUMATTE KIMASHITA. (A crowd gathered after the accident.) |
| 13 | When a noun is modified by "SUBETENO (all)" | 1 | | 1 | | 1 | 2 | SUBETE-NO GEIJUTSUKA-GA UTSUKUSHII MONO-WO BYOUSHA SHIYOU-TO SURU-TOWA KAGIRIMASEN. (Not all artists seek to portray the beautiful.) |
Table A.1: Rule for referential property (cont.)

| No. | Condition | Indef P | Indef V | Def P | Def V | Gener P | Gener V | Example |
|---|---|---|---|---|---|---|---|---|
| 14 | When a noun is modified by "SUBETE-NO (all)" and is followed by a particle "GA (subject)" | 1 | | 1 | 1 | 1 | 2 | SUBETE-NO GEIJUTSUKA-GA UTSUKUSHII MONO-WO BYOUSHA SHIYOU-TO SURU-TOWA KAGIRIMASEN. (Not all artists seek to portray the beautiful.) |
| 15 | When a noun is modified by "DOKUJI-NO (of one's own)" or "ONAJI-NO (the same)" | | | 1 | 2 | | | CHUUGOKUJIN-WA DOKUJI-NO MOJI-WO HATSUMEI SHIMASHITA. (The Chinese invented their own writing system.) |
| 16 | When a noun is adjacent to and modified by a pronoun | 1 | | 1 | 3 | 1 | | KARE-NO OKUSAN-WA FUJIWARAKE-NO SHUSSHIN-DESU. (His wife is a Fujiwara.) |
| 17 | When a noun is modified by a pronoun | 1 | | 1 | 2 | 1 | | |
| 18 | When a noun is modified by a word which indicates location such as "UE (the upper)" and "TONARI (the neighbor)" | 1 | | 1 | 2 | 1 | | |
| 19 | When a noun is a word which indicates a location such as "NEMOTO (the base)" | 1 | | 1 | 2 | 1 | | |
| 20 | When a noun is "JIKOKU (one's country)" or "HATSU (first)" | 1 | | 1 | 2 | 1 | | |
Table A.1: Rule for referential property (cont.)

| No. | Condition | Indef P | Indef V | Def P | Def V | Gener P | Gener V | Example |
|---|---|---|---|---|---|---|---|---|
| 21 | When a noun is modified by the past form of the verb + "ATO (after)" | 1 | | 1 | 3 | 1 | | |
| 22 | When a noun is modified by a word which indicates the superlative such as "MOTTOMO (the best)" and "ICHIBAN (the first)" | | | 1 | 2 | | | KOKO-NI ARU KURUMA-NO NAKA-DE KORE-WA ICHIBAN TAKAI KURUMA DESU. (This is the most expensive car in this lot.) |
| 23 | When a noun is modified by an ordinal number | | | 1 | 2 | | | MITTSU-NO SHIGOTO-GA ARIMASHITA-GA KARE-WA NIBANME-NO SHIGOTO-WO HIKIUKERU KOTO-NI SHIMASHITA. (He was offered three jobs and he decided to take the second job.) |
| 24 | When a noun is as in "HUTATSU-NO-UCHI-NO OOKII-HOU (the bigger one of two things)" | | | 1 | 2 | | | WATASHI-WA HUTARI KYOUDAI-NO-UCHI WAKAI HOU-DESU. (I am the younger of two brothers.) |
| 25 | When a noun is modified by a past predicative clause | 1 | | 1 | 1 | 1 | | KORE-WA WATASHI-GA KARE-KARA KARITA JISHO-DESU. (This is the dictionary that I borrowed from him.) |
| 26 | When a noun is modified by a past predicative clause which contains a definite noun phrase followed by a particle such as "GA" or "WA" | 1 | | 1 | 3 | 1 | | KORE-WA WATASHI-GA KARE-KARA KARITA JISHO-DESU. (This is the dictionary that I borrowed from him.) |
Table A.1: Rule for referential property (cont.)

| No. | Condition | Indef P | Indef V | Def P | Def V | Gener P | Gener V | Example |
|---|---|---|---|---|---|---|---|---|
| 27 | When a noun is modified by a verb modified by a definite noun phrase followed by a particle such as "GA" or "WA" | 1 | 1 | 1 | 3 | 1 | | KARE-GA WATASHI-NI KURETA JOGEN-WA HIJOU-NI YAKUDACHI-MASHITA. (The advice he gave me was very helpful.) |
| 28 | When a noun is modified by a verb which contains a definite noun phrase followed by a particle such as "GA" or "WA" | 1 | | 1 | 1 | 1 | | WATASHI-GA AGETA SHOUSASSHI-WO MADA MOTTE IMASU-KA. (Do you still have the booklet I gave you?) |
| 29 | When a noun is modified by a clause which contains a definite noun phrase followed by a particle such as "NI" or "DE" | 1 | | 1 | 1 | 1 | | KOKO-NI ARU KURUMA-NO NAKA-DE KORE-WA ICHIBAN TAKAI KURUMA-DESU. (This is the most expensive car of all the cars in this lot.) |
| 30 | When a noun is modified by a verb "ARU" which contains a definite noun phrase followed by a particle "NI" or "DE" | 1 | | 1 | 1 | 1 | | KOKO-NI ARU KURUMA-NO NAKA-DE KORE-WA ICHIBAN TAKAI KURUMA-DESU. (This is the most expensive car of all the cars in this lot.) |
| 31 | When a noun is modified by a verb modified by a definite noun phrase followed by a particle "GA" or "NO" | 1 | | 1 | 2 | 1 | | |
Table A.1: Rule for referential property (cont.)

| No. | Condition | Indef P | Indef V | Def P | Def V | Gener P | Gener V | Example |
|---|---|---|---|---|---|---|---|---|
| 32 | When a noun is adjacent to and modified by a definite noun followed by a particle "NO" | 1 | | 1 | 1 | 1 | | KARE-WA SONO BENGOSHI-NO MUSUKO-NO HITORI-DESU. (He is one of the sons of that lawyer.) |
| 33 | When a noun is modified by a definite noun followed by a particle "NO" | 1 | | 1 | 1 | 1 | | KARE-WA SONO BENGOSHI-NO MUSUKO-NO HITORI-DESU. (He is one of the sons of that lawyer.) |
| 34 | When a noun is modified by an expression containing a pronoun | 1 | | 1 | 1 | 1 | | SEKIYU JIGYOU-WA WATASHI-GA TE-WO DASHITAKU-NAI JIGYOU-NO HITOTSU-DESU. (The oil business is a business that I don't wish to get into.) |
| 35 | When a noun is followed by a particle "MADE (to)", "KARA (from)", or "HE (to)" | 1 | | 1 | 2 | 1 | | SHIAWASE-SOUNA DAIANA-JOU-WA KEKKON-SHIKI-GA OWARU-TO JIIN-KARA DETE KIMASHITA. (A radiant Lady Diana came out of the cathedral after the wedding.) |
| 36 | When a noun is followed by a particle "GA", "MADE", "KARA", or "HE", and the topic of the sentence is a person's name | 1 | | 1 | 2 | 1 | | SHIAWASE-SOUNA DAIANA-JOU-WA KEKKON-SHIKI-GA OWARU-TO JIIN-KARA DETE KIMASHITA. (A radiant Lady Diana came out of the cathedral after the wedding.) |
| 37 | When a noun has a coordinate noun followed by a particle "MADE", "KARA" or "HE" | 1 | | 1 | 2 | 1 | | |
Table A.1: Rule for referential property (cont.)

| No. | Condition | Indef P | Indef V | Def P | Def V | Gener P | Gener V | Example |
|---|---|---|---|---|---|---|---|---|
| 38 | When a noun is followed by "YOU (for)" | 1 | | 1 | | 1 | 2 | SOUGON-NA FUJISAN-WA TAKUSAN-NO RYOKOUYOU-NO PANHURETTO-NI NIHON-NO SHOUCHOU-TO SHITE DETE IMASU. (A majestic Mt. Fuji appears as a symbol of Japan on many brochures for travel.) |
| 39 | When a noun is a clause containing a generic noun phrase followed by a particle "WA" and is not a pronoun or a numeral | 1 | | 1 | | 1 | 2 | DAIGAKU-WA KOUDO-NO KYOUIKU-WO UKERU TOKORO-DESU. (A college is an institution of higher learning.) |
| 40 | When a noun is followed by a particle "WA" and it modifies an adjective | 1 | | 1 | 3 | 1 | 4 | KONO HEYA-NI HAITTE-KURU KUUKI-WA TSUMETAI-DESU. (The air that is being blown into this room is cold.) |
| 41 | When a noun is followed by a particle "YORI" and modifies an adjective | 1 | | 1 | 3 | 1 | 5 | KIKAI-DE SEIHUN-SARETA KONA-YORI ISHIUSU-DE TSUKURARETA KONA-NO HOU-GA ANATA-NIWA IINO-DESU. (Stone ground flour is better for you than machine processed flour.) |
| 42 | When a noun is followed by a particle "GA" and modifies an adjective "YOI (good)" | 1 | | 1 | 3 | 1 | 6 | KIKAI-DE SEIHUN-SARETA KONA-YORI ISHIUSU-DE TSUKURARETA KONA-NO HOU-GA ANATA-NIWA YOINO-DESU. (Stone ground flour is better for you than machine processed flour.) |
Table A.1: Rule for referential property (cont.)

| No. | Condition | Indef P | Indef V | Def P | Def V | Gener P | Gener V | Example |
|---|---|---|---|---|---|---|---|---|
| 43 | When a noun is followed by a particle "GA" and modifies an adjective "SUKIDA (like)" | 1 | | 1 | 2 | 1 | 3 | |
| 44 | When a noun is followed by a particle "WO" and modifies a verb "TANOSHIMU (enjoy)" | 1 | | 1 | 2 | 1 | 3 | OITA JONSON-HUJIN-WA SOUCHO-NO SANPO-WO TANOSHIMI-MASU. (Old Mrs Johnson enjoys her early morning walks.) |
| 45 | When a noun is "HOU (be more ... than ...)" and modifies an adjective | 1 | | 1 | 1 | 1 | 4 | KIKAI-DE SEIHUN-SARETA KONA-YORI ISHIUSU-DE TSUKURARETA KONA-NO HOU-GA ANATA-NIWA IINO-DESU. (Stone ground flour is better for you than machine processed flour.) |
| 46 | When a noun is followed by a particle "TOWA" or "TOIUNOWA" which easily follows a generic noun phrase | | | 1 | | 1 | 2 | HONTOU-NO SHINSHI-TO IUNOWA SHUKUJO-NI ITSUMO SHINSETSU-DESU. (The perfect gentleman is always courteous to a lady.) |
| 47 | When a noun is followed by a particle "WA" or "MO" and modifies a verb modified by an adverb such as "ITSUMO (always)" and "IPPAN (generally)" | | | 1 | | 1 | 2 | SHINSHI-WA HUTSUU SHUKUJO-NO TAME-NI DOA-WO AKEMASU. (The gentleman usually opens the door for the lady.) |
Table A.1: Rule for referential property (cont.)

| No. | Condition | Indef P | Indef V | Def P | Def V | Gener P | Gener V | Example |
|---|---|---|---|---|---|---|---|---|
| 48 | When a noun is followed by a particle "WA" or "MO" and modifies a verb modified by an adverb such as "DENTOU (traditionally)" | | | | | | | |
| 49 | When a noun is followed by a particle "WA" or "MO" and modifies a verb modified by a word such as "MUKASHI-WA (in earlier times)" and "IMA-WA (at present)" | | | | | | | |
| 50 | When a noun is followed by a particle "WA" or "MO" and modifies a verb modified by a word such as "MUKASHI (in earlier times)" and "IMA (at present)" | | | | | | | |
| 51 | When a noun is followed by a particle "WA" or "MO" and modifies a verb modified by a word followed by "DEWA (topic)" | | | | | | | |



Table A.1: Rule for referential property (cont.)

| No. | Condition | Indef P | Indef V | Def P | Def V | Gener P | Gener V | Example |
|---|---|---|---|---|---|---|---|---|
| 52 | When a noun is followed by a particle "WA", "MO", or "GA" and modifies a verb "DEKIRU (can)" or a noun followed by a copula "DA (be)" | 1 | | 1 | 2 | 1 | 4 | RAKUDA-WA MIZU-WO NOMANAKU-TEMO NAGAI AIDA ARUKU-KOTO-GA DEKIMASU. (A camel can go for a long time without water.) |
| 53 | When a noun is followed by a particle "WA", "MO", or "GA" and modifies a progressive form of a verb | 1 | 2 | 1 | 2 | 1 | | KURUMA-WA MICHI-NO WAKI-NI CHUUSHA-SHITE ARIMASU. (Cars are parked along the street.) |
| 54 | When a noun modifies a verb modified by a word such as "ITSUMO (always)" and "IPPAN (generally)" | 1 | | 1 | 1 | 1 | 2 | NIHON-DEWA SHINDA HITO-WA TAITEI KASOU SAREMASU. (In Japan, the dead are usually cremated.) |
| 55 | When a noun is a common noun or a verbal noun | 1 | 1 | 1 | | 1 | | KANOJO-WA TEEBURU-NO HOKORI-WO TORINOZOKU-TAME-NI HUKIN-WO TSUKAIMASHITA. (She used a cloth to dust the table.) |
| 56 | When a noun is followed by "DEWANAI (be not)" | 1 | 4 | 1 | 2 | 1 | | |
Table A.1: Rule for referential property (cont.)

| No. | Condition | Indef P | Indef V | Def P | Def V | Gener P | Gener V | Example |
|---|---|---|---|---|---|---|---|---|
| 57 | When a noun is "BAAI (when)", "TOKORO (where)" and "KOTO (that)" | 1 | 1 | 1 | 1 | 1 | | SHITSUBOU-SHITA FOUDO-DAITOURYOU-WA JIBUN-GA DAITOURYOU SENKYO-NI YABURETA KOTO-WO MITOME-MASHITA. (A disappointed President Ford admitted that he was defeated in the election.) |
| 58 | When a noun is modified by an adjective "ARU (a certain)" | 1 | 2 | | | | | ARU GAKUDAN-WA SONO KOUEN-DE ONGAKU-WO ENSOU SHIMASHITA. (A band gave a performance at the park.) |
| 59 | When a noun is modified by a word such as "HOKA-NO (other)" and "BETSU-NO (another)" | 1 | 2 | | | | | |
| 60 | When a noun is followed by a copula "DA (be)" and it is not modified by a generic noun phrase followed by a particle "WA" | 1 | 1 | 1 | | 1 | 1 | KARE-WA SONO-BENGOSHI-NO MUSUKO-DESU. (He is a son of that lawyer.) |
| 61 | When a noun is followed by a copula "DA (be)" and is modified by a generic noun phrase followed by a particle "WA" | 1 | 1 | 1 | | 1 | 1 | INU-WA YAKU-NI TATSU DOUBUTSU-DESU. (A dog is a useful animal.) |
Table A.1: Rule for referential property (cont.)

| No. | Condition | Indef P | Indef V | Def P | Def V | Gener P | Gener V | Example |
|---|---|---|---|---|---|---|---|---|
| 62 | When a noun is followed by a copula "DA (be)" and is not modified by a generic noun phrase followed by a particle "WA" | 1 | 2 | 1 | | 1 | 1 | |
| 63 | When a noun is modified by a numeral | 1 | 10 | | | | | SONO RESUTORAN-DEWA ICHINICHI-NI HITO-HUKURO-NO TAMANEGI-WO TSUKAIMASU. (That restaurant uses a bag of onions a day.) |
| 64 | When a noun is a numeral and is not followed by a particle "WA" | 1 | 10 | | | | | KARE-WA SONO BENGOSHI-NO MUSUKO-NO HITORI-DESU. (He is one of the sons of that lawyer.) |
| 65 | When a noun is a numeral and is not followed by a particle "WA" | 1 | 4 | 1 | | 1 | | KARE-WA SONO BENGOSHI-NO MUSUKO-NO HITORI-DESU. (He is one of the sons of that lawyer.) |
| 66 | When a noun is modified by a bunsetsu followed by a particle "TOIU (called)" | 1 | 2 | 1 | | 1 | | KURASU-NI IKEDA-TOIU HITO-GA HITORI IRU. (We have one person called Ikeda in our class.) |
| 67 | When a noun is followed by a particle "WA", "MO", "GA", or "WO", and it modifies a verb modified by a numeral | 1 | 10 | 1 | | 1 | | SONO IE-NIWA SHININ-GA HITORI DEMASHITA. (There was a death in the family.) |
Table A.1: Rule for referential property (cont.)

| No. | Condition | Indef P | Indef V | Def P | Def V | Gener P | Gener V | Example |
|---|---|---|---|---|---|---|---|---|
| 68 | When the same noun appears previously in the same sentence and is indefinite | 1 | | 1 | 2 | 1 | 1 | KARE-WA JOUYOUSHA-TO TORAKKU-WO ICHIDAI-ZUTSU MOTTE IMASU-GA KARE-WA JOUYOUSHA-NI-SHIKA HOKEN-WO KAKETE IMASEN. (He has a car and a truck but only the car is insured.) |
| 69 | When the same noun appears previously in the same sentence and is definite | 1 | | 1 | 4 | 1 | 2 | |
| 70 | When the same noun appears previously in the same sentence and is generic | 1 | | 1 | 3 | 1 | 2 | KIKAI-DE SEIHUN-SARETA KONA-YORI ISHIUSU-DE TSUKURARETA KONA-NO-HOU-GA ANATA-NIWA IINO-DESU. (Stone ground flour is better for you than machine processed flour.) |
| 71 | When the same noun appears previously in a coordinate structure in the same sentence and is not generic | 1 | | 1 | 3 | 1 | | KARE-WA JOUYOUSHA-TO TORAKKU-WO ICHIDAI-ZUTSU MOTTE IMASU-GA KARE-WA JOUYOUSHA-NI-SHIKA HOKEN-WO KAKETE IMASEN. (He has a car and a truck but only the car is insured.) |
| 72 | When the same noun appears in the previous five sentences and is indefinite | 1 | 1 | 1 | 3 | 1 | | |
| 73 | When the same noun appears in the previous five sentences and is definite | 1 | | 1 | 4 | 1 | 2 | |
Table A.1: Rule for referential property (cont.)

| No. | Condition | Indef P | Indef V | Def P | Def V | Gener P | Gener V | Example |
|---|---|---|---|---|---|---|---|---|
| 74 | When the same noun appears in the previous five sentences and is generic | 1 | | 1 | 3 | 1 | 2 | |
| 75 | When the same noun appears in a coordinate structure in the previous five sentences and is not generic | 1 | | 1 | 3 | 1 | | |
| 76 | When a noun is followed by a particle "DE" or "TO", it modifies a verb, and the noun modified by the verb is generic | 1 | | 1 | | 1 | 2 | KIKAI-DE SEIHUN-SARETA KONA-YORI ISHIUSU-DE TSUKURARETA KONA-NO HOU-GA ANATA-NIWA IINO-DESU. (Stone ground flour is better for you than machine processed flour.) |
| 77 | When a noun is followed by a particle "GA" and modifies a clause containing a word such as "ITSUMO (always)" and "IPPAN (generally)" | 1 | | 1 | 1 | 1 | 2 | KOKO-WA MAITOSHI KOUZUI-GA TAKUSAN OKORU TOKORO-DESU. (This is an area where there are many floods every year.) |
| 78 | When a noun is followed by a particle "GA" and is modified by a definite noun phrase followed by a particle "NO" | 1 | | 1 | 1 | 1 | | KOKO-NI WATASHI-NO KIPPU-GA ARIMASU, SHASHOU-SAN. (Here is my ticket, conductor.) |
Table A.1: Rule for referential property (cont.)

| No. | Condition | Indef P | Indef V | Def P | Def V | Gener P | Gener V | Example |
|---|---|---|---|---|---|---|---|---|
| 79 | When a noun is "HAIKEI-NI (background)" or "TAISHOU-NI (target)" and follows a noun followed by a particle "WO" | 1 | | 1 | | 1 | 2 | |
| 80 | When a noun is "HAIKEI-NI (background)" or "TAISHOU-NI (target)" and modifies a verb modified by a noun followed by a particle "WO" | 1 | | 1 | | 1 | 2 | |
| 81 | When a noun is followed by a particle "NO" and modifies a proper noun | 1 | | 1 | | 1 | 1 | |
| 82 | When a noun is followed by a particle "NO" and modifies a noun | 1 | | 1 | 2 | 1 | 3 | OOKU-NO WAKAI OTOKO-NO HITO-TACHI-WA RIKUGUN-NI HEIEKI-SHIMASU. (Many young people serve in the army.) |
| 83 | When a noun is followed by a particle "TO-IU" | 1 | | 1 | 2 | 1 | | KURASU-NI IKEDA-TO IUU HITO-GA HITORI IRU. (We have an Ikeda in our class.) |
| 84 | When a noun is "NANI (what)" | 1 | 3 | 1 | | 1 | | |
| 85 | When a noun is followed by a particle "NO-YOUNA (such as or like)" | 1 | | 1 | 2 | 1 | 3 | |
| 86 | When a noun is followed by a particle "WA" and modifies a numeral | 1 | 1 | 1 | 1 | | | |
Table A.2: Rule for number

| No. | Condition | Sing P | Sing V | Plur P | Plur V | Uncnt P | Uncnt V | Example |
|---|---|---|---|---|---|---|---|---|
| 1 | When a noun is a noun predicate, and the subject of the noun predicate is singular | 1 | 3 | 1 | | 1 | | KARE-WA IZEN MINSHUTOU-NO TOUIN-DE ATTA. (He used to be a Democrat.) |
| 2 | When a noun is a noun predicate, and the subject of the noun predicate is plural | 1 | | 1 | 3 | 1 | | |
| 3 | When a noun is a noun predicate, and the subject of the noun predicate is uncountable | 1 | | 1 | | 1 | 3 | KORE-WA JUNKIN-DESU. (This is pure gold.) |
| 4 | When a noun is a singular pronoun such as "KARE (he)" and "WATASHI (I)" | 1 | 3 | | | | | KANOJO-WA KEEKI-WO IKKO PIKUNIKKU-HE MOTTE YUKIMASHITA. (She took a cake to the picnic.) |
| 5 | When a noun is a singular demonstrative such as "KORE (this)" and "ARE (that)" | 1 | 3 | 1 | | 1 | | KOKO-NI ARU KURUMA-NO NAKA-DE KORE-WA ICHIBAN TAKAI KURUMA-DESU. (This is the most expensive car in this lot.) |
| 6 | When a noun is "HITORI (one person)", "HITOTSU (one)", or "IPPIKI (one)" | 1 | 3 | 1 | | 1 | | KARE-WA SONO BENGOSHI-NO MUSUKO-NO HITORI-DESU. (He is one of the sons of that lawyer.) |
| 7 | When a noun is a singular numeral | 1 | 3 | 1 | | 1 | | WATASHI-WA KONO KINJO-NO ICHI-KAZOKU-SAE SHIRIMASEN. (I don't know a family in this neighborhood.) |
Table A.2: Rule for number (cont.)

| No. | Condition | Sing P | Sing V | Plur P | Plur V | Uncnt P | Uncnt V | Example |
|---|---|---|---|---|---|---|---|---|
| 8 | When a noun is not generic | | | | | | | KARE-WA SONO BENGOSHI-NO MUSUKO-NO HITORI-DESU. (He is one of the sons of that lawyer.) |
| 9 | When a noun is definite | | | | | | | KARE-WA SONO BENGOSHI-NO MUSUKO-NO HITORI-DESU. (He is one of the sons of that lawyer.) |
| 10 | When a noun is modified by a demonstrative adjective such as "SONO (the)", "ANO (of that)" and "KONO (of this)" | | | | | | | KARE-WA SONO BENGOSHI-NO MUSUKO-NO HITORI-DESU. (He is a son of the lawyer.) |
| 11 | When a noun is modified by "HITORI (one person)", "HITOTSU (one)", or "IPPIKI (one)" | | | | | | | KURASU-TOWA JUGYOU-WO ISSHO-NI TOTTE-IRU GAKUSEI-TACHI-NO HITOTSU-NO GURUUPU-DESU. (A class is a group of students taking a course together.) |
| 12 | When a noun is modified by a singular numeral | | | | | | | SONO RESUTORAN-DEWA ICHINICHI-NI HITO-HUKURO-NO TAMANEGI-WO TSUKAIMASU. (That restaurant uses a bag of onions a day.) |
| 13 | When a noun contains a prefix which is a singular numeral | | | | | | | WATASHI-WA KONO KINJO-NO ICHI-KAZOKU-SAE SHIRIMASEN. (I don't know a family in this neighborhood.) |



Table A.2: Rule for number (cont.)

| No. | Condition | Sing P | Sing V | Plur P | Plur V | Uncnt P | Uncnt V | Example |
|---|---|---|---|---|---|---|---|---|
| 14 | When a noun is followed by a particle "WA", "WO", "GA", or "MO", and modifies a verb modified by a singular numeral | 1 | 1 | 1 | | 1 | | KANOJO-WA KEEKI-WO IKKO PIKUNIKKU-HE MOTTE YUKIMASHITA. (She took a cake to the picnic.) |
| 15 | When a noun is followed by a particle "WA", "WO", "GA", or "MO", and modifies a verb modified by a singular numeral | 1 | 1 | 1 | | 1 | | SANDOICCHI-NI NIKU-GA HITOKIRE HOSHII-DESU. (I'd like a slice of meat on my sandwich.) |
| 16 | When a noun is a noun such as "HITOBITO (people)" | | | 1 | 3 | | | |
| 17 | When a noun is modified by a word "SUBETE-NO (all)" | | | 1 | 2 | 1 | | SUBETE-NO GEIJUTSUKA-GA UTSUKUSHII MONO-WO BYOUSHA SHIYOU-TO SURU-TOWA KAGIRIMASEN. (Not all artists seek to portray beautiful things.) |
| 18 | When a noun is modified by a plural numeral | | | 1 | 3 | | | |
| 19 | When a noun is modified by a plural numeral | | | 1 | 3 | | | KARE-WA ISSEN-NIN-NO CHOUSHUU-NI ENZETSU-WO SHIMASHITA. (He gave a speech to an audience of 1,000 people.) |
| 20 | When a noun is a plural numeral | | | 1 | 2 | | | |
Table A.2: Rule for number (cont.)

| No. | Condition | Sing P | Sing V | Plur P | Plur V | Uncnt P | Uncnt V | Example |
|---|---|---|---|---|---|---|---|---|
| 21 | When a noun is a plural numeral | | | 1 | 2 | | | |
| 22 | When a noun is a plural pronoun | | | 1 | 3 | | | KAZOKU-NO HITOBITO-GA WAREWARE-WO TAZUNE-NI KIMASHITA. (A family came to visit us.) |
| 23 | When a noun is followed by a suffix which indicates plurality such as "TACHI" and "RA" | 1 | | 1 | 3 | | | ISHA-WA BYOUNIN-TACHI-NO SEWA-WO SHIMASU. (Doctors take care of patients.) |
| 24 | When a noun is followed by a particle "DE" and modifies a verb modified by a generic noun phrase followed by a particle "WA" | 1 | | 1 | 2 | 1 | 1 | NUNOJI-WA SENSHOKU-KOUJOU-DE TSUKURAREMASU. (Cloth is produced by textile mills.) |
| 25 | When a noun is followed by a particle "WA" or "GA", and modifies a verb such as "KOERU (be over)", "KOSU (be over)", and "TASSURU (amount to)" | 1 | | 1 | 3 | 1 | | |
| 26 | When a noun is followed by a particle "WO" and modifies a verb "ATSUMERU (gather)" | | | 1 | 3 | | | |
| 27 | When a noun is followed by a particle "GA" and modifies a verb such as "ATSUMARU (come together)" and "RANRITSU SURU (be flooded)" | | | 1 | 3 | | | SONO JIKO-GA HASSEI-SHITE-KARA YAJIUMA-GA ATSUMATTE KIMASHITA. (A crowd gathered after the accident.) |
Table A.2: Rule for number (cont.)

| No. | Condition | Sing P | Sing V | Plur P | Plur V | Uncnt P | Uncnt V | Example |
|---|---|---|---|---|---|---|---|---|
| 28 | When a noun is followed by a particle "WO" and modifies a verb such as "SAITEN SURU (mark)" and "MOTARASU (bring)" | 1 | | 1 | 2 | 1 | | |
| 29 | When a noun is followed by a particle "WO" or "NI" and modifies a verb modified by "IKURADEMO (as much ...)" or "NANKAIDEMO (as many times as ...)" | 1 | | 1 | 2 | 1 | | |
| 30 | When a noun is followed by a particle "WA", "WO", "GA", or "MO", and modifies a verb modified by a plural noun | 1 | | 1 | 2 | 1 | | WATASHI-WA SENSHUU HON-WO NISATSU YOMIMASHITA. (I read two books last week.) |
| 31 | When a noun is followed by a particle "WA", "WO", "GA", or "MO", and modifies a verb modified by a plural noun | 1 | | 1 | 2 | 1 | | |
| 32 | When a noun is followed by a particle "WA", "WO", "GA", or "MO", and it modifies a verb modified by "OOZEI" etc. | 1 | | 1 | 2 | 1 | | |
| 33 | When a noun is Noun X in "Noun X NO HITORI (one of Noun X)" | | | 1 | 3 | | | KARE-WA SONO BENGOSHI-NO MUSUKO-NO HITORI-DESU. (He is one of the sons of that lawyer.) |
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Table A. 2: Rule for number (cont. 





Condition 


Sing 


Plur 


Uncnt 


Example 


34 


When a noun follows "... 
NO ICHIBU (part of)" or 
"... NOUCHINO (of)", 


1 





1 


3 


1 


2 




35 


When a noun is followed by 
a particle "GA" and modi- 
fies a verb "SUKIDA(like)", 


1 





1 


2 


1 







36 


When a noun is followed by 
a particle "WO" and mod- 
ifies a verb "TANOSHIMU 
(enjoy)", 


1 





1 


2 


1 





OITA JONSON-HUJIN-WA SOUCHO-NO
SANPO-WO TANOSHIMIMASU.
(Old Mrs Johnson
enjoys her early morning
walks.)


37 


When a noun is an uncountable
noun which does not
have a modifier,


1 





1 





1 


3 


RAKUDA-WA MIZU-WO
NOMANAKU-TEMO NAGAI AIDA
ARUKU-KOTO-GA DEKIMASU.
(A camel can go for
a long time without water.)


38 


When a noun is an uncountable
noun such as water,


1 





1 





1 


2 


RAKUDA-WA MIZU-WO
NOMANAKU-TEMO NAGAI AIDA
ARUKU-KOTO-GA DEKIMASU.
(A camel can go for
a long time without water.)


39 


When a noun is an uncountable
noun modified by
"HODO-NO (extent)" or
"... TEKI-DA (-cal)",


1 


2 


1 


2 


1 





KANOJO-WA SONO
MOUJIN-GA WASURE-RARE-NAI
HODO-NO MAGOKORO-NO KOMOTTA
SHINSETSU-WO
SONO MOUJIN-NI SHITE
YARIMASHITA.
(She showed a kindness toward
the blind man that he
would never forget.)
Table A.2: Rule for number (cont.)





Condition 


Sing 


Plur 


Uncnt


Example 


40 


When a noun is "MONO 
(thing)" modified by an 
adjective, 


1 





1 





1 


2 


SUBETE-NO GEIJUTSUKA-GA
UTSUKUSHII MONO-WO
BYOUSHA SHIYOU-TO
SURU-TOWA KAGIRIMASEN.
(Not all artists try to
portray beautiful things.)


41 


When a noun is followed
by a particle "WA", "WO",
"GA", or "MO", and follows
an adverb such as "TAKUSAN
(a lot)" and "IPPAI (a
lot)",


1 





1 


3 


1 


2 




42 


When a noun is followed
by a particle "WA", "WO",
"GA", or "MO", and modifies
a verb modified by an
adverb such as "TAKUSAN
(a lot)" and "IPPAI (a lot)",


1 





1 


3 


1 


2 




43 


When a noun is modified by 
"TAKUSAN-NO (a lot of)" 
or "IPPAI-NO (a lot of)". 








1 


3 


1 


2 


SOUGON-NA
FUJISAN-WA TAKUSAN-NO
RYOKOUYOU-NO
PANHURETTO-NI
NIHON-NO SHOUCHOU-TO
SHITE DETE IMASU.
(A majestic Mt. Fuji appears
as a symbol of Japan on
many travel brochures.)


44 


When a noun is modified by 
"TAKUSAN-NO (a lot of)". 








1 


3 


1 


2 




45 


When a noun is followed by
a particle "WO" and modifies
a verb "ABIRU (be
covered)",








1 


2 


1 


1 
Table A.2: Rule for number (cont.)





Condition 


Sing 


Plur 


Uncnt 


Example 


46 


When a noun is followed by
a particle "GA" and modifies
a verb such as "NARABU (be in
line)" and "ZOKUSHUTSU
SURU (appear one after
another)",








1 


2 


1 


1 






47 


When a noun is followed by
a particle "WA" and modifies
a noun predicate such as
"Noun X DA (be Noun X)",
and Noun X is plural,


1 





1 


5 


1 









48 


When a noun is followed by
a particle "WA" and modifies
a noun predicate such as
"Noun X DA (be Noun X)",
and Noun X is uncountable,


1 





1 





1 


6 


KORE-WA JUNKIN-DESU.
(This is pure gold.)



Appendix B 



Rule for Pronouns 



B.1 Rule for Demonstratives

We made 50 candidate enumerating rules and 10 candidate judging rules for
analyzing demonstratives. All the rules are given below.

B.1.1 Candidate Enumerating Rule 

1. When a pronoun is a demonstrative followed by the particle "GA" and a
non-ga-case zero pronoun is not yet recovered, the system analyzes the
non-ga-case zero pronoun before the analysis of the demonstrative.

2. When a pronoun is "so-series demonstrative adjective + noun a,"
{ (the noun phrase containing a noun a, 45) 

(the topic which is a subordinate of the noun a and which has the weight 
W and the distance D, W - D*2 + W) 

(the focus which is a subordinate of the noun a and which has the weight 
W and the distance D, W - D*2 + 10)} 

3. When a pronoun is "ko-series demonstrative adjective + noun a,"
{ (the noun phrase containing a noun a, 45) 

(the topic which is a subordinate of the noun a and which has the weight 
W and the distance D, W - D + 30) 
(the focus which is a subordinate of the noun a and which has the weight
W and the distance D, W - D + 30)} 

4. When a pronoun is "a-series demonstrative adjective + noun a," 
{ (the noun phrase containing a noun a, 45) 

(the topic which is a subordinate of the noun a and which has the weight 
W and the distance D, W - D*0.4 + 30)

(the focus which is a subordinate of the noun a and which has the weight 
W and the distance D, W - D*0.4 + 30)}

5. When a pronoun is "SORE (it)/ARE (that)/KORE (this)" or a demon- 
strative adjective and the previous bunsetsu contains the expression of the 
predicative form of a verb or the expression of enumerating examples such 
as "TOKA (and so on)," {(the expression, 40)} 

6. When a pronoun is "SORE/ARE/KORE" or a demonstrative adjective, 

{( The previous sentence (or the verb phrase in the conditional form con- 
taining a conjunctive particle such as "GA (but)", " DAGA (but)", and 
"KEREDO (but)" if the verb phrase is in the same sentence), 15)} 

7. When a pronoun is "KORE-WA/SORE-WA/KORE-DE/SORE-DE", is the 
first word of the sentence, and is not a case component of a verb, 

{( The previous sentence (or the verb phrase in the conditional form con- 
taining a conjunctive particle such as "GA (but)", " DAGA (but)", and 
"KEREDO (but)" if the verb phrase is in the same sentence), 5)} 

8. When a pronoun is "KORE-WA/SORE-WA/KORE-DE/SORE-DE" and is 
the first word of the sentence, 

{( The previous sentence (or the verb phrase in the conditional form con- 
taining a conjunctive particle such as "GA (but)", " DAGA (but)", and 
"KEREDO (but)" if the verb phrase is in the same sentence), 5)} 

9. When a pronoun is "(KORE (this)/SORE (it))(HODO (extent)/DAKE 
(only)/DEMO (even)/KOSO (just))", 

{(The previous sentence (or the verb phrase in the conditional form
containing a conjunctive particle such as "GA (but)", "DAGA (but)", and
"KEREDO (but)" if the verb phrase is in the same sentence), 5)}

10. When a pronoun is "KOUIU (like this)", "SOUIU (like it)", "KON'NA (like
this)", etc.,

{( the previous sentence (or the verb phrase in the conditional form con- 
taining a conjunctive particle such as "GA (but)", " DAGA (but)", and 
"KEREDO (but)" if the verb phrase is in the same sentence), 5)} 

11. When a pronoun is "KON'NA (like this)", 
{(the next sentences, 20)} 

12. When a pronoun is "KON'NA (like this)" and "KON'NA (like this)" + noun
is followed by a particle "NI/DE/SURA/WA/NO",

{(the next sentences, 1)} 

13. When a pronoun is "KON'NA (like this)" and "KON'NA (like this)" + noun
is followed by a particle "WO/MO/DENAI",

{(the previous sentences, 1)} 

14. When a pronoun is "(SONO (the)/KONO (this)) (TAME (for)/UE (in)/ 
HOKA (other)/KOTO (thing)/ BAAI (case)/TSUDO (every time))", 

{( the previous sentence (or the verb phrase in the conditional form con- 
taining a conjunctive particle such as "GA (but)", " DAGA (but)", and 
"KEREDO (but)" if the verb phrase is in the same sentence), 30)} 

15. When a pronoun is "(SONO (its)/KONO (this))(IMI (meaning) /GEN'IN 
(cause)/KEKKA (result) /HAIKEI(background)/KOUKA (effect))", 

{(the previous sentence (or the verb phrase in the conditional form
containing a conjunctive particle such as "GA (but)", "DAGA (but)", and
"KEREDO (but)" if the verb phrase is in the same sentence), 5)}

(Note on rule 15: This rule is based on Yanagi's method [Yanagi 94].)

16. When a pronoun is "ANO/SONO/AN'NA/SON'NA (like it)" + noun which
indicates time,

{(the previous sentence (or the verb phrase in the conditional form
containing a conjunctive particle such as "GA (but)", "DAGA (but)", and
"KEREDO (but)" if the verb phrase is in the same sentence), 30)}

17. When a pronoun is "KONO/KON'NA" + noun which indicates time, 
{(the present time, 5)} 

18. When a pronoun is "(KONO/KON'NA) (CHI (place)/ KUNI (country)/ 
SHAKAI (society))", 

{(the present place, 5)} 

19. When a pronoun is "SONO (the or its)" in "Noun X TO SONO Noun (Noun 
X and the Noun)" or "Noun X YA SONO Noun (Noun X or the Noun)", 
{(Noun X, 50)}

20. When a pronoun is "SONO(its)" in "Noun X NO(of) SONO(its) Noun", 
{(Noun X, 30)} 

21. When a pronoun is "AA (oh)/SORE/KORE/ARE" followed by a comma, 
{(it is regarded as an exclamation, 30)} 

22. When a pronoun is "SOU/KON'NA/KON'NANI/SON'NANI/SOREHODO" 
and it modifies an adjective or an adverb, 

{(Introduced as indefinite, 30)}

23. When a pronoun is such as "ARE-YA KORE-YA", 
{(an idiomatic expression, 50)}

24. When a pronoun is a demonstrative pronoun, a demonstrative adverb, or a 
demonstrative adjective, 

{(Introduce an individual, 10)} 

25. When a pronoun is a demonstrative in quotations, 
{(Introduce an individual, 5)} 

26. When a pronoun is an a-series demonstrative,
{(Introduce an individual, 5)} 
27. When a pronoun is "KOU/KON'NAHUUNISHITE/KOUSHITE",
{(the previous sentences, 25)} 

28. When a pronoun is "KOU/KON'NAHUUNISHITE/KOUSHITE", 
{(the next sentences, 26)} 

29. When a pronoun is a part of "KOU/KON'NAHUUNI" + conditional form 
or "KOU SHITE" and is not the last word in the sentence, 

{(the previous sentences, 7)} 

30. When a pronoun is "KON'NA HUUNI (like this)", and is not the last word 
in the sentence, 

{(the previous sentences, 2)} 

31. When a pronoun is "KOUDA" or "KON'NA-HUUDAN" , 
{(the next sentences, 3)} 

32. When a pronoun is a demonstrative which does not indicate location and 
the previous sentence is a quotation, {(the previous sentences, 3)} 

33. When a pronoun is a demonstrative which does not indicate location, 
{(the previous sentences, 1)} 

34. When a pronoun is a demonstrative which does not indicate location and 
the next sentence is a quotation, 

{(the next sentences, 3)} 

35. When a pronoun is a demonstrative which does not indicate location, 
{(the next sentences, 1)} 

36. When a pronoun is "AA (like that)",
{(the previous sentence, 20)} 

37. When an anaphor is "SOU (so)/SOUSHITE (do so)/SONOYOUNI (like
it)",

{(the previous sentences, 30)} 
38. When an anaphor is "SOU/SOUSHITE/SONOYOUNI" and is in the subordinate
clause which has a conjunctive particle such as "GA (but)", "DAGA
(but)", and "KEREDO (but)" or an adjective conjunction such as "YOUNI
(as)",

{(the main clause, 45)} 

39. When a pronoun is "KON'NANI/AN'NANI/SON'NA-HUUNI/AN'NA-HUUNI"
and does not modify an adjective or an adverb,

{(the previous sentence, 25)} 

40. When a pronoun is "KOKODE (here)/SOKODE (there)" and the first word 
of the sentence, 

{(the previous sentence, 5)} 

41. When a pronoun is "KOKODE (here)/SOKODE (there)", is the first word 
of the sentence, and is not a case component of a verb, 

{(the previous sentence, 5)} 

42. When a pronoun is "KOKO (here)/SOKO (there)", 
{(the present place, 15)} 

43. When a pronoun is "KOKO (here)/SOKO (there)" + noun which indicates 
time, 

{(the present time, 50)} 

44. When a pronoun is "(ARE/KORE/SORE)(KARA (from)/MADE (to))", 
{(the present time, 15)} 

45. When a pronoun is "KOCHIRA (this gentleman)" and is in a quotation, 
{(the first person, 25)} 

46. When a pronoun is "KOCHIRA (this gentleman)" which is not in a quota- 
tion, 

{(the first person, 13)} 



This rule is based on Matsuoka's method [Matsuoka et al 95].
47. When a pronoun is "SOCHIRA (the other)" which is in a quotation,
{(the second person, 13)} 

48. When a demonstrative is the subject of a noun/adjective predicative
sentence and the predicate is a word which signifies judgment such as
"JISSEKI-DA (result)", "ZAN'NEN-DA (unfortunate)", "KAKUJITSU-DA (sure)",
and "...TEKI-DA (-cal)",

{(the previous sentences, 50)}

49. When a demonstrative is in a subordinate clause containing "YOUNI (as)", 
"GA (but)", and "KEREDOMO (but)", 

{(the main clause, 10)}

50. When a pronoun is a demonstrative pronoun or "SONO (of it) / KONO (of 
this) / ANO (of that)", 

{(A topic which has the weight W and the distance D, W - D - 2)
(A focus which has the weight W and the distance D, W - D + 4)}
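
The scoring in rule 50 can be sketched in code. This is only a minimal illustration, not the implementation used in this thesis: the `Entity` record, its field names, and the sample weights and distances are hypothetical. Only the two formulas (W - D - 2 for a topic, W - D + 4 for a focus) are taken from the rule above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """A discourse entity kept as a topic or a focus (hypothetical structure)."""
    text: str
    kind: str      # "topic" or "focus"
    weight: int    # W in the rules
    distance: int  # D in the rules


def enumerate_rule_50(entities):
    """Return (candidate, points) pairs as listed by enumerating rule 50."""
    candidates = []
    for entity in entities:
        if entity.kind == "topic":
            # Topic: W - D - 2
            candidates.append((entity.text, entity.weight - entity.distance - 2))
        elif entity.kind == "focus":
            # Focus: W - D + 4
            candidates.append((entity.text, entity.weight - entity.distance + 4))
    return candidates


if __name__ == "__main__":
    # Sample values are invented for illustration only.
    context = [
        Entity("HON", "topic", weight=20, distance=1),
        Entity("JIKO", "focus", weight=15, distance=3),
    ]
    print(enumerate_rule_50(context))
```

Each enumerating rule is represented here as a function that returns a list of (candidate, points) pairs, which mirrors the `{(candidate, points)}` notation used in the rule list.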

B.1.2 Candidate Judging Rule

1. When a pronoun is a demonstrative pronoun and a candidate referent has
a semantic marker HUM (human), the candidate referent is given -10 points.
We use the Noun Semantic Marker Dictionary [Watanabe et al 92] as the
semantic marker dictionary.



2. When a pronoun is a demonstrative pronoun, a candidate referent is given
the points in Table 5.3 by using the highest semantic similarity between
the candidate referent and the codes {5200003010 5201002060 5202001020
5202006115 5241002150 5244002100} in BGH [NLRI 64], which signify human
beings.

3. When a pronoun is "KOKO (here) / SOKO (there) / ASOKO (over there)" 
and a candidate referent has a semantic marker LOC, which indicates loca- 
tion, the candidate referent is given 10 points. 

4. When a pronoun is "KOKO/SOKO/ASOKO", a candidate referent is given
the points in Table 5.5 by using the semantic similarity between the candidate
referent and the codes {6563006010 6559005020 9113301090 9113302010
6471001030 6314020130}, which indicate locations in BGH [NLRI 64].

Table B.1: Points given by the similarity of the verb

Similarity level |   | 1 | 2 | 3   | 4 | 5 | 6   | Exact Match
Point            |   |   | 1 | 1.5 | 2 | 3 | 3.5 | 4



5. When a pronoun is a so-series demonstrative adjective, the system consults
examples of the form "noun X NO noun Y" whose noun Y is modified by
the pronoun, and gives a candidate referent the points in Table 5.6 by the
similarity between the candidate referent and noun X. The Japanese
Co-occurrence Dictionary [EDR 95c] serves as a source of examples for
"X NO Y".

6. When a pronoun is a non-so-series demonstrative adjective, the system
consults examples of the form "Noun X NO (of) Noun Y (Y of X)" whose Noun
Y is modified by the pronoun, and gives a candidate referent the points in
Table 5.8 by the similarity between the candidate referent and noun X.



7. When a candidate referent of a pronoun does not satisfy the semantic marker
of the case component in the case frame, it is given -5.



8. A candidate referent of a pronoun is given the points in Table 5.11 by using
the highest semantic similarity between the candidate referent and examples
of the case component in the case frame.

9. When a pronoun is a demonstrative followed by "GA Noun X NI-NARU
(become Noun X)", a candidate referent is given the points in Table 5.11 by
using the semantic similarity between the candidate referent and Noun X.



10. A candidate referent is given the points in Table B.1 by using the semantic
similarity between the verb modified by the demonstrative and the verb
modified by the candidate referent.
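
The judging rules adjust the points of candidates that the enumerating rules have already listed. The sketch below illustrates judging rules 1, 3, and 7 only. It is not the system's implementation: the function names, the flag arguments, and the sample candidates are hypothetical, and the combination step (adding all points for a candidate and taking the candidate with the highest total) is an assumption that this appendix does not itself state.

```python
LOCATION_PRONOUNS = {"KOKO", "SOKO", "ASOKO"}


def judging_points(pronoun, is_demonstrative_pronoun, markers, fits_case_frame):
    """Points that judging rules 1, 3, and 7 give to a single candidate.

    markers: set of semantic markers of the candidate, e.g. {"HUM"}.
    fits_case_frame: whether the candidate satisfies the semantic marker
    of the case component in the case frame.
    """
    points = 0
    if is_demonstrative_pronoun and "HUM" in markers:
        points -= 10  # rule 1: human candidate for a demonstrative pronoun
    if pronoun in LOCATION_PRONOUNS and "LOC" in markers:
        points += 10  # rule 3: location candidate for KOKO/SOKO/ASOKO
    if not fits_case_frame:
        points -= 5   # rule 7: case-frame marker not satisfied
    return points


def best_candidate(enumerated, adjustments):
    """Assumed combination step: sum points per candidate, return the maximum."""
    totals = {}
    for name, points in enumerated:
        totals[name] = totals.get(name, 0) + points
    for name, points in adjustments.items():
        totals[name] = totals.get(name, 0) + points
    return max(totals.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Invented example: two candidates for the demonstrative pronoun "SORE".
    enumerated = [("SENSEI", 17), ("HON", 16)]
    adjustments = {
        "SENSEI": judging_points("SORE", True, {"HUM"}, True),
        "HON": judging_points("SORE", True, set(), True),
    }
    print(best_candidate(enumerated, adjustments))
```

In this invented example the human candidate loses 10 points under rule 1, so the non-human candidate ends up with the higher total.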
B.2 Rule for Personal Pronouns

We made 4 candidate enumerating rules and 6 candidate judging rules for
analyzing personal pronouns. All the rules are given below.

B.2.1 Candidate Enumerating Rule 

1. When an anaphor is a first personal pronoun, 
{(the first person (the speaker) in the context, 25)} 

2. When an anaphor is a second personal pronoun, 
{(the second person (the hearer) in the context, 25)} 

3. When an anaphor is a third personal pronoun, 
{(a first person, -10) (a second person, -10)}

4. When an anaphor is a personal pronoun, 

{(a topic which has the weight W and the distance D, W - D - 2)
(a focus which has the weight W and the distance D, W - D + 4)}

B.2.2 Candidate Judging Rule

1. When an anaphor is a personal pronoun and a candidate referent has a 
semantic marker HUM, the candidate referent is given 10 points. 

2. When an anaphor is a personal pronoun, a candidate referent is given the
points in Table 5.10 by using the semantic similarity between the candidate
referent and the codes {5200003010 5201002060 5202001020 5202006115
5241002150 5244002100}, which indicate human beings in BGH [NLRI 64].



3. When a candidate referent of a personal pronoun does not satisfy the
semantic marker of the case component in the case frame, it is given -5.



4. A candidate referent of a personal pronoun is given the points in Table 5.11 
by using the highest semantic similarity between the candidate referent and 
examples of the case component in the case frame. 
5. When a pronoun is a personal pronoun followed by "GA Noun X NI-NARU
(become Noun X)", a candidate referent is given the points in Table 5.11 by
using the semantic similarity between the candidate referent and Noun X.



6. A candidate referent is given the points in Table B.1 by using the semantic
similarity between the verb modified by the pronoun and the verb
modified by the candidate referent.

B.3 Rule for Zero Pronouns 

We made 19 candidate enumerating rules and 4 candidate judging rules for
analyzing zero pronouns. All the rules are given below.

B.3.1 Candidate Enumerating Rule 

1. When an anaphor is a ga-case zero pronoun whose verb is followed by
auxiliary verbs such as "KURERU" and "KUDASARU" and there is a ni-case
zero pronoun in the verb, the ni-case zero pronoun is analyzed first.
With respect to the ga-case zero pronoun, {(do not fill a zero pronoun, -5)}

2. When a zero pronoun is not in a quotation and is a case component of a 
verb whose ga-case is easily filled by a first person (speaker) such as "OMOU 
(think)" and "HOSHII (want)", {(a first person, 50)} 

3. In a quotation, when an anaphor is a ga-case zero pronoun which is easily 
filled with a first person, whose verb is such as "YARU (give)", "SHITAI 
(want)", and "IKU (go)," {(the first person, 5)} 

4. When a zero pronoun is a ga-case zero pronoun which is not easily filled with 
a first person, whose verb is such as "DAROU", "YOUDA", and "SOUDA",
{(the first person, -20)}

5. In a quotation, when an anaphor is a ga-case zero pronoun which is easily
filled with a second person, whose verb is such as "KURERU (give)",
"NASARU (do)", and "KURU (come)", or whose verb is in an imperative
sentence or an interrogative sentence,

{(the first person, -30)(the second person, 25)}

6. In a quotation, when an anaphor is a ga-case zero pronoun, 
{(the first person, 15)} 

7. When an anaphor is a ga-case zero pronoun of "Y DA (is Y)" in the expres- 
sion of "X WO Y DA TO MINASU (consider X as Y)", 

{(NounX, 50)} 

8. When a zero pronoun is the subject of a noun predicative sentence and the
predicate is "KU (phrase)", "HAIKU (haiku)", "UTA (song)" and "TANKA
(tanka)",

{(the previous sentence, 25)}

9. When a zero pronoun is the subject of a noun predicative sentence and the 
predicate is a word which indicates time, 

{(the time of the previous sentence, 25)} 

10. When a zero pronoun is a ga-case of the main (or subordinate) clause in 
a complex sentence, the complex sentence is connected by the conjunctive 
particle indicating disagreement of subjects in a complex sentence such as 
"NODE (because)" and "NARABA (if)" and the subject of the subordinate 
(or main) clause is not omitted and is followed by the particle "GA," 
{(the subject of the subordinate (or main) clause, -30)}

11. When a zero pronoun is the subject of a noun predicative sentence and the 
predicate is a word which indicates action, 

{(the previous sentence, 21)(the next sentence, 21)} 

12. When the next sentence is a quotation, 
{(the next sentence, 1)} 

13. When a zero pronoun is a ga-case component, 

{(A topic which has the weight W and the distance D, W - D*2 + 1)
(A focus which has the weight W and the distance D, W - D + 1)
(A subject of a clause coordinately connected to the clause containing the
anaphor, 25)
(A subject of a clause subordinately connected to the clause containing the
anaphor, 23)
(A subject of a main clause whose embedded clause contains the anaphor,
22)}

14. When a zero pronoun is not a ga-case component, 

{(A topic which has the weight W and the distance D, W - D*2 - 3)
(A focus which has the weight W and the distance D, W - D*2 + 1)}

15. When there is "Noun a" in another case component of the verb which has 
the analyzed case component (the analyzed zero pronoun), 

{(Noun a, -20)} 

16. When a zero pronoun is a case component of a verb which modifies a noun 
phrase and is not modified by any phrase, 

{(the system does not analyze the zero pronoun, 3)} 

17. When a zero pronoun is an optional case component, 
{(the system does not analyze the zero pronoun, 3)} 

18. When a zero pronoun is a ga-case component, 

{(the system does not analyze the zero pronoun, 15)} 

19. When a zero pronoun is not a ga-case component, 
{(the system does not analyze the zero pronoun, 18)} 
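
Rules 16 to 19 list "the system does not analyze the zero pronoun" as a candidate with its own points, so it competes with the real referents. The sketch below illustrates this for a ga-case zero pronoun, using only the topic and focus formulas of rule 13 and the 15 points of rule 18. It is an illustration under stated assumptions, not the system's implementation: the function names, the tuple layout, and the sample weights are hypothetical, and choosing the highest-scoring candidate is an assumption not stated in this appendix.

```python
NO_ANALYSIS = "<do not analyze>"


def ga_case_candidates(topics, focuses):
    """List candidates for a ga-case zero pronoun.

    topics and focuses are lists of (text, W, D) tuples (hypothetical layout).
    """
    candidates = []
    for text, weight, distance in topics:
        # Rule 13, topic: W - D*2 + 1
        candidates.append((text, weight - distance * 2 + 1))
    for text, weight, distance in focuses:
        # Rule 13, focus: W - D + 1
        candidates.append((text, weight - distance + 1))
    # Rule 18: leaving the zero pronoun unanalyzed is itself a candidate.
    candidates.append((NO_ANALYSIS, 15))
    return candidates


def pick(candidates):
    """Assumed selection step: take the candidate with the most points."""
    return max(candidates, key=lambda candidate: candidate[1])


if __name__ == "__main__":
    # Invented values: a nearby topic wins, a distant one does not.
    print(pick(ga_case_candidates([("TAROU", 20, 1)], [])))
    print(pick(ga_case_candidates([("TAROU", 20, 4)], [])))
```

Under these assumptions the 15 points of rule 18 act as a threshold: a topic or focus is chosen only when its score exceeds 15, so a distant antecedent is not forced onto the zero pronoun.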

B.3.2 Candidate Judging Rule 

1. When a candidate referent of a case component (a zero pronoun) does not
satisfy the semantic marker of the case component in the case frame, it is
given -5.

2. A candidate referent of a case component (a zero pronoun) is given the
points in Table 5.11 by using the highest semantic similarity between the
candidate referent and examples of the case component in the case frame.
3. When a zero pronoun is a subject of "GA Noun X NI-NARU (become Noun
X)", a candidate referent is given the points in Table 5.11 by using the
semantic similarity between the candidate referent and Noun X.



4. A candidate referent is given the points in Table B.1 by using the semantic
similarity between the verb having the zero pronoun and the verb modified
by the candidate referent.